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Abstract 

This  Phase  II  effort  completed  the  successful  enhancement,  implementation,  and 
demonstration  of  the  passive  sensor  Altitude  and  Terrain  Awareness  System  (ATAS)  operation 
using  both  computer  generated  and  real  imagery.  The  products  of  this  effort  include:  1)  a  test  bed 
and  an  overall  architecture  of  ATAS  that  comprises  both  the  real-time  image  generation  and  real¬ 
time  egomotion  and  terrain  shape  estimation;  and  2)  hardware  and  software  specifications  for  a 
real-time  flight  rated  ATAS.  The  test  bed  is  composed  of  a  flat-terrain  image  generator  that 
produces  CGI  frames  simulating  perspective  images  viewed  by  an  observer  (camera)  flying  over 
flat  terrain  decorated  by  arbitrary  sinusoidal  patterns,  a  rolling-terrain  image  generator  that 
produces  video  (analog)  imagery  simulating  images  viewed  by  an  observer  flying  over  a 
sinusoidal  rolling  terrain  decorated  by  sinusoidal  patterns,  and  a  frame  grabber  using  hardware 
and  software  image  grabbing  tools  hosted  by  a  Macintosh  Quadra  840  AV  computer.  ATAS  is 
composed  of  an  image  processor;  a  snapshot  egomotion  and  terrain  shape  estimator;  and  a 
Kalman  filter,  to  provide  smoother  high-accuracy  on-line  estimates.  Furthermore,  we  developed 
two  types  of  egomotion  and  terrain  shape  estimators  to  deal  with  different  kinds  of  terrain 
conditions:  a  geometry-based  estimator  and  a  structure-based  estimator.  We  conducted  a 
successful  on-line  demonstration  of  the  ATAS  ground-based  prototype,  driving  the  system  with 
CGI  and  flight-recorded  imagery,  generating  running  estimates  of  altitude  and  attitude  with 
respect  to  the  overflown  terrain.  The  evaluation  results  indicate  that  ATAS  has  a  capability  for 
accurate,  efficient,  and  robust  egomotion  and  terrain  shape  estimation  over  various  terrain  (flat 
and  non-flat)  and  under  different  motion  conditions,  when  the  proper  estimators  (the  geometry- 
based  estimator  or  the  structure-based  estimator)  and  the  proper  pixellation  sizes  are  selected. 
The  Phase  n  findings  demonstrated  the  overall  capability  of  ATAS  and  established  benchmark 
criteria  for  evaluating  future  systems.  The  effort  demonstrated  operation  with  off-the-shelf 
hardware,  at  performance  levels  that  can  significantly  enhance  critical  on-board  functions, 
including  passive  navigation,  optically-based  TF/TA,  and  FLIR-based  target  designation  and 
weapons  control. 

Significant  conunercial  potential  exists  for  ATAS,  both  in  military  and  non-military 
applications.  In  the  military  applications,  the  passive  motion  and  terrain  shape  estimation 
capability  provided  by  ATAS  can  offer  significant  improvement  to  pilot’s  TF/TA  subsystems, 
can  drive  helmet-mounted  displays  for  pilot  terrain  awareness,  and  can  enhance  stealth  operation. 
In  the  non-military  applications,  the  same  passive  motion  and  terrain  shape  estimation  capability 
can  play  even  a  larger  role  in  the  development  of  the  robot  navigation/guidance,  automatic 
vehicle  collision  avoidance,  and  virtual  reality  systems. 
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1.  INTRODUCTION 

Terrain-following  (TF)  flight  is  performed  when  entering  or  leaving  airspace  in  high  threat 
battle  areas.  The  intent  is  the  timely  movement  towards  an  objective  while  using  the  terrain  for 
masking.  It  combines  vertical  path  control  for  following  terrain  contours  with  occasional  lateral 
path  control  for  heading  changes  and  obstacle  avoidance.  This  difficult  and  dangerous  task 
demands  a  continuing  awareness  and  control  of  one's  own  altitude,  attitude,  and  velocity  with 
respect  to  the  terrain,  and  it  results  in  high  crew  workload. 

To  help  reduce  this  workload,  we  have  designed  and  evaluated  a  passive  sensor  Altitude  and 
Terrain  Awareness  System  (ATAS),  which,  unlike  conventional  radar-  or  laser-based  systems, 
operates  with  passively-sensed  images  to  provide  a  real-time  estimate  of  terrain-relative  altitude. 
A  system  built  around  a  passive  imager  rather  than  an  active  sensor  significantly  enhances  covert 
operation  and  reduces  countermeasure  sensitivity.  It  has  the  potential  for  reducing  the  workload 
associated  with  the  actual  TF  flight  control  and  guidance  tasks,  without  incurring  an  increased 
workload  penalty  in  other  areas,  particularly  in  threat  vigilance  and  defensive  weapons  control. 

The  TF  environment  imposes  stringent  requirements  on  any  proposed  passive-imager  system. 
First,  the  system  must  be  capable  of  operating  with  information-dense  images  of  an  ever- 
changing  battlefield  environment,  images  which  are  orders-of-magnitude  more  complex  than 
what  might  be  expected  in,  say,  a  multi-aircraft  air-to-air  engagement.  Second,  the  system  must 
be  relatively  robust  with  respect  to  image  noise  and  be  capable  of  operating  in  the  presence  of 
smoke,  flares,  and  other  incidental  and  deliberate  sources  which  act  to  degrade  the  sensor  image. 
Third,  the  system  must  operate  in  real-time  at  sample  rates  needed  to  drive  the  pilot's  display  at 
adequate  rates.  Finally,  to  ensure  simplicity  and  reliability,  the  system  must  be  relatively 
autonomous  of  other  on-board  systems,  but  capable  of  incorporating  exogenous  subsystem 
information  available  from  sources  such  as  the  inertial  navigation  system  (INS),  an  on-board 
Defense  Map  Agency  data  base,  etc. 

The  system  which  can  best  satisfy  all  of  these  requirements  is  a  passive  imaging  system 
which  uses  the  feature-free  optical  flow-field  present  in  dynamic  image  sequences,  and 
illustrated  in  figure  1-1.  This  flow-field  arises  because  of  the  relative  motion  of  the  aircraft  with 
respect  to  the  scene,  causing  the  sensed  image  to  flow  with  time,  as  the  viewing  geometry 
changes  with  time.  Flow-fields  are  a  function  of  the  instantaneous  viewing  geometry  (the 
location  and  orientation  of  the  imager  with  respect  to  the  visual  world,  the  imager's  field-of-view 
and  Foresight  angle,  etc.),  the  rate  of  change  of  this  viewing  geometry  (the  linear  and  angular 
velocity  of  the  vehicle),  and  the  intrinsic  geometry  of  the  viewed  world.  Flow-fields  can  be  quite 


1 


R92  1 9 1  Charles  River  Analytics 

complex,  but  the  richness  of  the  information  contained  in  them  makes  them  ideally  suited  to 
support  a  real-time  TF  ATAS  using  passively-sensed  optical-flow. 


Figure  1-1:  Image  Flow  Associated  with  a  Dynamically  Changing  Image 


The  overall  system  concept  of  a  passive  ATAS  operating  in  a  low-level  flight  mode  is 
illustrated  in  figure  1-2,  and  a  conventional  modular  ATAS  architecture  is  shown  in  figure  1-3.  A 
forward-looking  passive  imager  provides  sequences  of  real-time  two-dimensional  dynamic 
images  of  the  approaching  terrain.  A  flow-field  computer  then  generates  the  image  flow  arising 
from  the  aircraft's  motion  over  the  terrain.  The  computed  image  flow-field  is  then  processed  by  a 
central  estimator,  which  generates  on-line  estimates  of  the  relevant  aircraft  states  and  terrain 
shape  parameters.  This  latter  set  includes  a  three-dimensional  impact  time  (depth)  map  that  gives 
the  flight  time  to  all  points  in  the  imager  field-of-view,  where  the  impact  time  is  defined  as  the 
speed-scaled  range  or  depth.  In  effect,  the  system  generates  an  instantaneous  estimate  of  the 
topology  and  shape  of  the  approaching  terrain.  This  information  is  then  fed  to  a  terrain  shape  and 
altimeter  module,  which  then  determines  terrain  shape  and  terrain-relative  altitude.  These  are 
then  displayed  to  the  pilot  via  a  helmet-mounted  display  to  provide  continuous  terrain  awareness. 
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Figure  1-2:  System  Concept  for  Passive  Sensor  Altitude  and 
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Figure  1-3:  A  Conventional  ATAS  Architecture 


1.1  Technical  Objectives 

The  primary  objective  of  this  Phase  II  SBIR  effort  was  to  develop  and  demonstrate  a 
prototype  version  of  the  passive  sensor  Altitude  and  Terrain  Awareness  System  (ATAS).  Five 
goals  were  achieved: 

•  Enhancement  of  the  Predecessor  Phase  I  Design 

•  Implementation  a  ATAS  Ground  Based  Demonstrator 

•  Development  of  a  Test  Bed 

•  Evaluation  of  the  ATAS  Ground  Based  Demonstrator 

•  Specification  of  Design  Requirements  for  a  Production  System 

We  now  describe  these  Phase  II  technical  objectives  in  the  following  paragraphs. 

The  first  objective  of  the  Phase  II  effort  was  to  enhance  the  Phase  I  ATAS  design  for 
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accurate,  and  robust  egomotion  and  terrain  shape  estimation  in  realistic  flight  situations.  The 
focus  here  was  to  upgrade  the  Phase  I  ATAS  design — whose  applicability  was  limited  to  flat 
terrain  and  constant  motion — to  situations  involving  non-flat  terrain  and  non-constant  motion  as 
occurs  in  realistic  TF  situations. 

The  second  objective  of  the  Phase  II  effort  was  to  implement  a  ground  based  demonstrator 
(GBD)  for  ATAS.  The  focus  here  was  to  ensure  a  modular  and  error-free  implementation  of 
ATAS  in  a  conventional  C  programming  language. 

The  third  objective  of  the  Phase  II  effort  was  to  develop  a  test  bed  for  ATAS.  The  focus  here 
was  to  configure  the  hardware  for  image  grabbing,  to  implement  the  computer  image  generators 
(CGI)  for  terrain  image  generation,  and  to  develop  performance  metrics  for  system  evaluation. 

The  fourth  objective  of  the  Phase  II  effort  was  to  evaluate  the  ATAS  ground-based 
demonstrator  using  the  test  bed  developed.  The  focus  here  was  the  extensive  evaluation  using 
both  the  CGI  and  real  imagery.  Specifically,  testing  of  the  ATAS  GBD  was  conducted  using 
computed  generated  imagery  from  the  test  bed,  and,  most  significantly,  using  flight-recorded  and 
real-time  frame-grabbed  imagery  from  aircraft  flying  over  actual  terrain. 

The  third  and  last  objective  of  the  Phase  II  effort  was  to  develop  the  design  requirements  for 
a  production  system  to  be  developed  under  a  Phase  III  effort.  This  included  the  specification  of 
the  hardware  requirements  and  generation  of  the  software  specifications.  The  commercial 
application  of  this  design  requirement  package  would  be  a  Phase  III  development  effort  to  build 
a  production  system  for  a  major  avionics  supplier  to  the  air  transport  industry. 

1.2  Technical  Approach 

Our  Phase  II  technical  approach  to  developing  and  demonstrating  a  prototype  version  of  the 
passive  sensor  altitude  and  terrain  awareness  system  (ATAS)  was  comprised  of  five  tasks: 

•  Review  of  Current  Approaches  to  Flow-Based  Egomotion  and  Terrain  Shape  Estimation 

•  Enhancement  and  Development  of  a  Prototype  ATAS 

•  Development  of  a  Test  Bed  for  ATAS 

•  Demonstration  and  Evaluation  of  System  Performance  with  CGI  and  flight-recorded 
imagery 

•  Recommendation  of  development  path  for  a  commercial  product 

We  first  reviewed  current  approaches  and  evaluated  their  advantages  and 
disadvantages.  A  literature  search  was  conducted  specifically  focusing  on  flow-field  based 
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estimation  technologies.  A  review  and  summary  of  the  most  promising  studies  were  conducted  to 
identify  capabilities  and  limitations  of  earlier  flow-field  based  approaches. 

We  then  designed  and  implemented  a  prototype  ground  based  ATAS  via  specifying  its 
overall  architecture,  its  functional  modules,  and  its  software  implementation.  Our  efforts  were 
focused  on  the  development  of  a  robust,  accurate,  and  efficient  ATAS  that  can  work  in  real-time. 
A  Kalman  filter  based  direct  approach  was  developed  that  comprised  of  three  modules:  1)  an 
image  processor;  2)  a  snapshot  egomotion  and  terrain  shape  estimator;  and  3)  a  Kalman  filter 
providing  smoothed  estimates. 

We  also  developed  and  implemented  a  test  bed  that  has  a  capability  for  flat  and  non-flat 
computer  terrain  image  generation.  We  defined  and  implemented  various  performance  metrics 
for  system  evaluation. 

Following  implementation  of  the  ground  based  ATAS  and  its  test  bed,  we  demonstrated 
and  evaluated  system  performance  via  an  extensive  testing  program.  ATAS  was  tested  using 
both  CGI  and  real  imagery,  and  using  both  the  flat  and  non-flat  terrain  imagery.  It  was  also  tested 
against  human  experiments. 

Finally,  we  recommended  a  development  path  for  a  commercial  product,  based  on  the 
results  of  the  evaluation  effort.  The  development  path  includes  an  enhanced  performance  version 
of  the  Phase  II  design,  and  demonstration  and  performance  evaluation  over  an  extended  CGI  and 
real  imagery  database. 

1.3  Summary  of  Results 

The  primary  result  of  this  Phase  II  effort  was  the  successful  enhancement,  implementation, 
and  demonstration  of  the  ATAS  operation  using  both  computer  generated  and  real  imagery.  The 
major  findings  supporting  this  effort  can  be  summarized  as  follows. 

We  developed  a  test  bed  and  an  overall  architecture  of  ATAS  that  comprises  both  the  real¬ 
time  image  generation  and  real-time  egomotion  and  terrain  shape  estimation.  We  developed  and 
implemented  a  flat-terrain  image  generator  that  produces  CGI  frames  simulating  perspective 
images  viewed  by  an  observer  (camera)  flying  over  flat  terrain  decorated  by  arbitrary  sinusoidal 
patterns.  We  also  developed  and  implemented  a  rolling-terrain  image  generator  that  produces 
video  (analog)  imagery  simulating  images  viewed  by  an  observer  flying  over  a  sinusoidal  rolling 
terrain  decorated  by  sinusoidal  patterns.  In  addition,  from  various  sources  we  obtained  other 
complex  CGI  terrain  images  used  widely  for  computer  vision  algorithm  testing.  For  real  image 
generation,  we  obtained  video  imagery  using  a  video  camera  fixed  on  a  helicopter  flying  over 
real  terrain.  To  convert  the  analog  video  imagery  into  digitized  image  frames,  we  implemented  a 
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frame  grabber  using  hardware  and  software  image  grabbing  tools  hosted  by  a  Macintosh  Quadra 
840  AV  computer. 

For  egomotion  and  terrain  shape  estimation,  we  developed  a  three  stage  modular  architecture 
comprised  of  an  image  processor;  a  snapshot  egomotion  and  terrain  shape  estimator;  and  a 
Kalman  filter,  to  provide  smoothed  high-accuracy  on-line  estimates.  This  modular  approach 
provided  performance  improvements  and  simplification  for  system  implementation  over  the 
original  design  developed  under  the  Phase  I  effort.  First,  it  avoids  the  complicated  interactions 
between  low-level  image  processing  and  high-level  motion  state  and  terrain  shape  estimation. 
Second,  separation  of  image  processing,  snapshot  estimation,  and  Kalman  filtering  makes  each 
problem  simpler  and  its  solution  easier  and  more  robust.  Third,  the  system  has  a  natural  interface 
to  incorporate  external  information  for  performance  enhancement.  For  example,  if  the  external 
information  on  observer  motion  state  is  available,  it  can  then  be  easily  incorporated  into  the 
Kalman  filter  without  reconfiguration  of  the  entire  system.  Finally,  we  believe  that  this  modular 
architecture  can  provide  the  basis  for  a  functional  model  of  the  human  flow-field  based  vision 
process. 

We  developed  two  types  of  egomotion  and  terrain  shape  estimators  to  deal  with  different 
kinds  of  terrain  conditions:  a  geometry-based  estimator  and  a  structure-based  estimator.  The 
geometry-based  estimator  assumes  that  the  terrain  shape  satisfies  a  polynomial  geometry,  and 
uses  this  model  for  terrain  shape  estimation.  The  structure  based  estimator  does  not  assume  a 
priori  a  terrain  model,  but  extracts  the  hidden  structural  relationship  among  imaged  terrain  points 
using  a  multi-layer  terrain  structure  model. 

For  static  estimation  in  the  snapshot  estimator  and  dynamic  estimation  in  the  Kalman  filter, 
we  employed  and  implemented  two  state-of-the-art  estimation  algorithms:  the  Singular  Value 
Decomposition  (SVD)  based  Least  Squares  (LS)  algorithm  and  the  Square-Root  Kalman  filter 
algorithm.  The  application  of  these  two  numerically  robust  algorithms  effectively  overcomes  any 
numerical  instabilities  that  might  be  present  in  the  estimation  process. 

Furthermore,  the  operation  of  AT  AS  needs  specification  of  only  a  few  meaningful 
parameters.  For  the  geometry  based  approach,  only  four  parameters  need  to  be  specified:  the 
terrain  model,  the  initial  heading  vector,  the  model  standard  deviations  for  motion  state  dynamic 
equations,  and  the  discount  factor  for  terrain  impact  time  (range)  map  evolution;  for  the  structure 
based  approach,  only  the  latter  two  parameters  need  to  be  specified.  In  contrast,  many  more 
parameters  would  be  needed  in  a  flow  or  feature  matching  based  system,  many  of  which  need  to 
be  determined  by  trial  and  error. 

We  conducted  a  successful  on-line  demonstration  of  the  ATAS  ground-based  prototype. 
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driving  the  system  with  CGI  and  flight-recorded  imagery,  generating  running  estimates  of 
altitude  and  attitude  with  respect  to  the  overflown  terrain.  The  evaluation  results  indicate  that 
ATAS  has  a  capability  for  accurate,  efficient,  and  robust  egomotion  and  terrain  shape  estimation 
over  various  terrain  (flat  and  non-flat)  and  under  different  motion  conditions,  when  the  proper 
estimators  (the  geometry-based  estimator  or  the  structure-based  estimator)  and  the  proper 
pixellation  sizes  are  selected. 

Using  CGI,  we  evaluated  ATAS  for  both  flat  and  non-flat  terrain  images.  With  the  flat 
terrain,  ATAS  had  estimation  errors  of  less  than  2  degree  in  aimpoint,  less  than  3  degree/second 
in  angular  rate,  and  less  than  5%  in  for  altitude  (impact  time). 

With  the  non-flat  terrain  images,  we  used  a  complex  sequence  that  was  generated  by 
simulating  a  flight  over  the  Yosemite  valley.  ATAS  demonstrated  an  excellent  ability  to  recover 
*  the  complicated  Yosemite  valley  structure,  whose  complexity  is  far  beyond  the  capability  of  any 
polynomial  geometry  terrain  models  (flat,  quadratic,  etc.).  Moreover,  ATAS  had  a  less  than  5 
degree  error  in  aimpoint,  and  a  less  than  1  degree/second  error  in  angular  rate,  for  these 
complicated  non-flat  terrain  images. 

For  the  flight-recorded  and  real-time  grabbed  terrain  images,  even  though  they  were  poorly 
“decorated”,  had  very  high  noise-to-signal  ratios,  and  the  observer  (airplane)  was  not  undergoing 
constant  motion,  ATAS  was  still  capable  of  generating  accurate  egomotion  and  terrain  shape 
estimates.  Specifically,  it  demonstrated  errors  of  less  than  5  degree  in  aimpoint,  less  than  2 
degree/second  in  angular  rate,  and  less  than  10%  in  altitude  (impact  time).  Moreover,  ATAS 
showed  an  excellent  ability  in  rapidly  following  and  tracking  the  non-constant  observer  motion. 
Besides  generating  overall  terrain  shape  estimates  such  as  speed-scaled  altitude,  ATAS  was  also 
capable  of  recovering  the  shape  of  local  objects  (in  this  case,  trucks  present  in  the  terrain 
images). 

We  also  compared  the  advantages  and  disadvantages  of  the  geometry-based  and  structure- 
based  estimators,  summarized  in  table  1.3-1. 

The  geometry-based  estimator  can  only  be  applied  to  a  terrain  that  can  be  approximated  by  a 
simple  geometry  model.  Moreover,  using  a  geometry-based  model  to  approximate  the  entire 
terrain  within  the  FOV  makes  it  impossible  to  recover  local  structures  hidden  in  the  terrain 
image,  such  as  isolated  objects.  In  contrast,  the  structure-based  approach  has  no  limitation  in  its 
applicability.  It  can  be  applied  to  both  simple  and  complicated  terrain.  Moreover,  it  can  be  used 
to  recover  the  local  structure  of  isolated  objects  in  the  imager  FOV. 

Large  pixels  can  be  used  for  image  measurement  computation  in  the  geometry-based 
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estimator.  Large  pixels  not  only  reduce  the  computation  load  greatly,  but  also  enhance  overall 
estimation  accuracy.  The  structure-based  estimator,  however,  requires  small  pixels  to  properly 
define  the  terrain  structure  and  thus  incurs  additional  computational  costs. 

The  geometry-based  estimator  has  superior  performance  over  the  structure-based  estimator 
when  the  actual  terrain  can  be  approximated  by  a  simple  geometry.  In  this  case,  the  geometry- 
based  model  represents  a  true  description  of  the  terrain  shape  being  imaged.  The  structure-based 
estimator,  however,  always  has  an  approximate  representation  of  the  actual  terrain,  no  matter 
how  simple  or  complicated  it  actually  is. 


Table  1.3-1:  Comparison  of  Geometry-Based  and  Structure-Based  Estimators 


Applicability 

Geometry-Based 

Simple  geometry  terrain 

Works  Better  for  large  pixel 

Better  for  simple  geometry 
terrain 

Structure-Based 

Unlimited 

Can  recover  local  structure 

Requires  small  pixel 

Better  for  complicated 
terrain  or  local  structures 

Using  ATAS,  we  also  analyzed  several  previous  experimental  studies  of  human  egomotion 
perception.  For  the  two  reported  on  here,  we  used  simple  additive  image  noise  to  model  the 
human's  inability  to  register  perfectly  noise-free  images,  and  we  were  able  to  show  how  this 
perceptual  noise  leads  to  corresponding  inaccuracies  in  estimates  of  egomotion  state.  In 
particular,  we  showed  how,  within  the  model  context,  an  assumed  10%  to  15%  image  noise  level 
leads  directly  to  model-predicted  egomotion  aimpoint  accuracy  in  the  experimentally  observed 
range  of  1.5  to  2.5  degree.  In  addition,  we  showed  how  these  aimpoint  accuracy  can  be  expected 
to  co-vary  with  velocity-to-height  ratios,  which  change  by  over  a  factor  of  10,  and  confirmed 
these  trends  with  experimental  data  showing  the  same  trends.  Although  more  model  validation  is 
clearly  needed,  we  feel  that  these  initial  matches  between  model  and  data  are  very  encouraging. 

Finally,  we  identified  requirements  for  production  system  development  and  demonstration. 
Specifically,  we  recommended  that  the  Phase  II  laboratory  prototype  serve  as  the  basis  for  the 
development  of  a  flight  prototype  system,  to  be  used  to  demonstrate  in-flight  capabilities  and 
system  potential.  A  two-step  development  and  demonstration  program  is  envisioned: 
1)  development  and  demonstration  of  a  flight  prototype  system  configured  for  in-flight  test  and 
evaluation;  and  2)  specification  of  design  requirements  for  a  follow-on  production  system. 

The  Phase  II  findings  demonstrated  the  overall  capability  of  ATAS  and  established 
benchmark  criteria  for  evaluating  future  systems.  The  effort  demonstrated  operation  with  off-the- 
shelf  hardware,  at  performance  levels  that  can  significantly  enhance  critical  on-board  functions, 
including  passive  navigation,  optically-based  TF/TA,  and  FLIR-based  target  designation  and 
weapons  control. 
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1.4  Report  Outline 

Chapter  2  provides  technical  background  on  the  proposed  passive  sensor  Altitude  &  Terrain 
Awareness  System  (AT AS).  Section  2.1  reviews  major  flow-field  computer  vision  approaches  to 
the  egomotion  and  terrain  estimation  problem,  and  outlines  their  advantages  and  disadvantages. 
Section  2.2  sketches  our  approach,  and  presents  its  enabling  technologies. 

Chapter  3  gives  a  technical  description  of  ATAS.  Section  3.1  presents  an  overview  of  the 
system  and  its  two  component  modules:  an  image  generator  and  an  egomotion  and  terrain  shape 
estimator.  Section  3.2  then  describes  technical  details  of  the  image  generator,  while  section  3.3 
describes  the  egomotion  and  terrain  shape  estimator. 

Chapter  4  demonstrates  and  evaluates  overall  operation  of  the  ground-based  ATAS.  We  show 
that  ATAS  can  accurately  generate  navigation-critical  state  and  terrain  shape  information,  such 
as  heading,  angular  velocity,  and  impact  time  (depth)  map  across  the  FOV.  Section  4.1  describes 
the  real-time  implementation  of  the  ATAS  system  and  define  the  performance  evaluation 
criterion.  Section  4.2  presents  evaluation  results  using  flat  terrain  CGI,  section  4.3  presents  the 
evaluation  results  using  non-flat  terrain,  and  section  4.4  presents  the  evaluation  results  using 
flight-recorded  imagery.  Section  4.5  concludes  the  chapter  with  a  summary  of  the  evaluation 
results. 

Chapter  5  concludes  the  report  with  a  summary,  conclusions,  and  recommendations  for 
further  development  and  demonstration. 
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2.  LITERATURE  REVIEW  AND  ENABLING  TECHNOLOGIES 

This  chapter  provides  technical  background  on  the  proposed  passive  sensor  Altitude  and 
Terrain  Awareness  System  (ATAS).  Section  2.1  reviews  major  flow-field  computer  vision 
approaches  to  the  egomotion  and  terrain  estimation  problem,  and  outlines  their  advantages  and 
disadvantages.  Section  2.2  sketches  our  approach,  and  presents  its  enabling  technologies 

2.1  Review  of  Computer  Vision  Approach  to  Flow  Based  Motion  and  Terrain  Estimation 

Computer  vision  approaches  to  flow  based  motion  and  terrain  estimation  problems  typically 
formulate  the  problem  as  a  common  two-stage  process:  first,  compute  the  optical /Zovv-yieW  based 
on  the  dynamically  changing  images;  second,  estimate  the  egomotion  states  and  the  relative 
depth/shape  based  on  the  computed  flow  and  possibly  on  a  model  of  the  viewed  terrain.  The 
approach  provides  a  convenient  divide  and  conquer  approach  to  egomotion  and  shape  estimation. 
One  group  of  researchers  can  thus  concentrate  on  flow-field  computation  without  considering  its 
consequences  for  motion  estimation,  while  another  group  can  concentrate  on  the  estimation  of 
motion  states  and  terrain  shape,  assuming  availability  of  an  accurately  computed  flow-field 
without  worrying  about  the  feasibility  of  its  generation. 

There  are  a  number  of  algorithms  for  computing  the  optical  flow-field,  given  a  sequence  of 
dynamically  changing  images.  These  algorithms  can  be  roughly  grouped  into  feature-based, 
gradient-based,  and  frequency-based  approaches.  Table  2.1-1  summarizes  the  advantages  and 
disadvantages  of  each  approach. 


Table  2.1-1:  Advantages  and  Disadvantages  of  Various  Flow-Field  Computation  Methods 


:lSgeth<Kl-  -  ^ 

Advantages"' 

■  ;  - 

illii 

Feature-based 

Lowest  computation  load 

Difficulty  in  matching  features 

Sparse  flow-field 

Not  suitable  for  featureless  imagery 

Gradient-based 

Easy  Implementation 

Wide  applicability 

Worst  performance 

High  computation  load 

Frequency-based 

Best  performance 

Highest  computation  load 

A  priori  imagery  information 
required 

Large  number  of  frames 

The  feature-based  approach  extracts  the  flow-field  by  matching  features  from  one  image 
frame  to  the  next  and  determining  their  shifts  (Anandan,  1989;  Singh,  1992;  Sridhar  &  Chatteiji, 
1994).  The  approach  has  the  advantage  of  the  lowest  computational  requirement  among  the  three 
approaches,  since  flow-fields  are  only  computed  for  the  sparse  set  of  features  in  the  image  and 
shift  computation  involves  only  a  few  arithmetical  operations.  However,  it  has  the  disadvantage 
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of  great  difficulty  in  matching  features  from  one  image  to  the  next  due  to  uncertainties  caused  by 
occlusion,  observer  motion,  and  image  sensor  noise.  Moreover,  the  approach  provides  only  a 
sparse  subset  of  the  entire  flow-field  that  depends  on  the  number  of  recognizable  features  in  an 
image,  and  cannot  be  used  in  featureless  images. 

The  gradient-based  approach  computes  the  flow-field  vector  at  each  pixel  from  spatiotemporal 
gradients  of  (filtered  or  original)  image  intensity  (Horn  &  Schunck,  1981;  Nagel,  1983;  Lucas  & 
Kanade,  1981).  The  approach  has  the  advantage  of  ease  of  implementation  in  both  sequential  and 
parallel  computers  since  computation  of  spatiotemporal  gradients  at  a  pixel  is  simple  and 
involves  only  a  few  neighbor  pixels.  The  approach  also  has  the  advantage  of  applicability  to  both 
feature-rich  and  featureless  images.  However,  it  has  the  disadvantage  of  resulting  in  the  worst 
performance  in  the  case  of  non-smooth  images  due  to  occlusion  boundaries  and  perceptual 
distortion.  Furthermore,  since  the  gradient-based  approach  usually  must  use  an  iterative 
computation  method  to  improve  the  quality  of  the  computed  flow-field,  the  computational  load 
for  the  approach  can  be  high. 

The  frequency-based  approach  generates  the  flow-field  vector  using  velocity-tuned  filters  in 
the  Fourier  domain  (Heeger,  1987;  Fleet,  1992;  Fleet  &  Jepson,  1990).  By  using  large  numbers 
of  images  and  rather  stable  frequency  domain  image  features  (energy  or  phase),  the  approach,  in 
general,  leads  to  the  most  accurate  flow-field  recovering.  The  approach,  however,  has  the  highest 
computational  load,  requires  a  priori  motion  and  image  information  in  designing  the  filters,  and 
is  the  most  difficult  to  implement  since  the  filters  must  be  tuned  according  to  the  image 
spectrum. 

Real-time  operation  and  lack  of  a  priori  motion  and  image  (terrain)  information  limited  our 
selection  of  flow-field  computation  algorithms  to  feature-based  or  gradient-based  approaches. 
Existing  flow-field  based  egomotion  and  terrain  estimation  technique  requires  a  relative  error  in 
the  computed  flow-field  to  be  less  than  10%  the  reliable  egomotion  and  terrain  estimation 
according  to  Barron,  Jepson  &  Tsotsos  (1990).  What  can  be  achieved  by  current  flow-field 
computation  algorithms,  however,  is  not  encouraging.  Barron,  Fleet  &  Beauchemin  (1994) 
conducted  quantitative  evaluation  of  three  approaches.  Results  show  that  only  the  well-tuned 
frequency-based  algorithms  could  achieve  the  required  accuracy,  whereas  both  the  feature-based 
and  gradient-based  algorithms  had  average  errors  close  to  or  above  10%  even  for  synthetic 
(computer-generated)  flat-terrain  images.  The  same  above  10%  average  errors  were  also  reported 
in  quantitative  evaluation  of  the  feature-based  and  gradient-based  algorithms  on  two  relatively 
simple  synthetic  image  cases  in  Little  &  Verri  (1989). 

There  are  also  a  number  of  algorithms  to  estimate  motion/shape  parameters,  given  the 
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computed  flow-field.  These  algorithms  can  be  roughly  grouped  into  instantaneous  (snapshot)  and 
Kalman  filter  estimators.  Of  the  instantaneous  estimators,  Zacharias,  Caglayan  &  Sinacori  (1988) 
developed  a  quasi-static  estimator  to  estimate  an  observer's  instantaneous  heading  (vertical  and 
lateral),  angular  velocity,  and  relative  depth  or  impact  times  (the  speed-scaled  range  or  3D  shape) 
by  decomposing  the  motion  state  and  terrain  shape  estimation  problems.  Heeger  &  Jepson  (1992) 
split  the  flow-field  equation  via  algebraic  manipulation  into  three  sets  of  equations.  The  first  set 
relates  the  flow  field  to  only  the  translational  (heading)  motion.  Thus,  relative  depth  and  rotation 
motion  (angular  velocity)  need  not  be  known  prior  to  estimating  the  translational  motion.  Once 
the  translational  motion  has  been  determined,  the  second  set  of  equations  relating  to  both 
translational  and  rotational  motion  is  then  used  for  estimating  rotation.  Finally,  the  depth  at  each 
pixel  is  estimated  using  the  third  set  of  equation,  given  the  estimated  translation  and  rotation. 

Kalman  filter  based  methods  use  the  computed  flow-field  vectors  as  observations,  and 
integrate  them  with  the  motion/shape  parameters  over  multiple  image  frames  to  achieve  robust 
estimation  of  motion/shape  parameters.  Various  schemes  have  been  developed  by  Broida  & 
Chellappa  (1986);  Wu,  Rink,  Caelli  &  Gourishankar  (1989);  Sridhar  &  Chatterji  (1994),  and 
Matthies,  Szelisky  &  Kanada  (1994).  Two  fundamental  problems  here  are:  1)  the  need  to  track 
and  establish  correspondence  over  a  larger  numbers  of  image  points  or  features  that 
appear/disappear  or  occlude  each  other;  and  2)  the  numerical  instability  of  the  filtering  process 
and  the  high  computational  cost  due  to  the  high  dimensional  nonlinear  dynamics  of  the 
motion/shape  parameters.  These  have  restricted  the  application  of  Kalman  filtering  mostly  to  the 
use  of  feature-based  flow-fields.  Even  in  these  cases,  some  promising  results  have  been  achieved 
in  dealing  with  real  images,  but  only  when  accurate  motion  states  are  known  (Sridhar  & 
Chatterji,  1994).  Extension  of  the  Kalman  filter  approach  to  non-feature  based  approaches  was 
proposed  by  Matthies  et  al.  (1994),  but  again,  accurate  motion  states  were  also  assumed  to  be 
known. 

Noticing  difficulties  in  accurately  and  efficiently  estimating  flow-fields,  Negahdaripour 
(1986),  Negahdaripour  &  Horn  (1989),  and  Zacharias,  Miao  &  Warren  (1995)  developed  direct 
approaches  that  estimate  egomotion  and  terrain  shape  without  computing  the  flow-field.  These 
direct  approaches,  however,  only  consider  simple  geometry  (flat,  quadratic,  etc.)  terrain  and 
constant  motion. 

2.2  Enabling  Technologies 

We  take  a  direct  approach  to  AT AS  development  to  avoid  flow-field  estimation,  and  its 
associated  estimation  errors  and  computational  load.  The  approach,  however,  requires  the  design 
of  a  new  estimation  architecture  and  the  development  of  new  solution  technologies  since  the 
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earlier  flow-field  based  estimation  architecture  and  its  associated  solution  technologies  cannot  be 
applied  without  first  computing  the  flow-field. 

In  section  2.2.1,  we  present  a  Kalman  filter  based  direct  approach  to  egomotion  and  terrain 
shape  estimation.  In  sections  2.2.2  and  2.2.3,  we  briefly  describe  two  key  technologies  in  support 
of  the  approach;  Singular  Value  Decomposition  (SVD)  based  Least  Squares  (LS)  algorithms  and 
Square  Root  Kalman  filter  algorithms. 


2.2.1  Kalman  Filter  Based  Direct  Approach  to  Egomotion  and  Terrain  Shape  Estimation 


Figure  2.2-1  illustrates  the  Kalman  filter  based  direct  approach  to  egomotion  and  terrain 
shape  estimation.  The  estimation  system  comprises  three  functional  modules — an  image 
processor,  a  snapshot  estimator,  and  a  Kalman  filter,  to  generate  real-time  motion  and  terrain 
shape  estimates. 


images 


measurements 


snapshot 
motion  &  shape 
estimates 


smoothed 
motion  &  shape 
estimates 


- 

I| 

— 

compute  image 

— ► 

generate  snapshot 

— ► 

generate  smoothed 

measurements 

motion  &  shape 

motion  and  shape 

estimates 

estimates 

Figure  2.2-1:  Egomotion  and  Terrain  Shape  Estimation  System 


The  first  stage  image  processor  computes  image  measurements  each  time  a  new  image  is 
acquired.  These  image  measurements  consist  of  the  first  order  image  gradients  and  results  from 
arithmetic  operations  between  the  image  gradients  and  image  pixel  coordinates  in  an  observer- 
fixed  coordinate  system.  Consequently,  except  for  the  image  gradient  computation,  this  image 
processor  is  significantly  different  from  a  conventional,  first  stage  flow-field  computer 
(Negahdaripour  &  Lee,  1992).  The  flow-field  computation  involves  complicated  numerical 
techniques,  cannot  be  done  accurately,  and  is  very  computationally  intensive.  The  image 
measurement  computation,  on  the  other  hand,  involves  only  few  arithmetic  operations  between 
image  gradients  and  precisely  known  pixel  coordinates,  and  thus  can  be  done  accurately  and 
requires  much  less  computation  than  does  the  flow-field  computation. 

Using  solely  the  current  image  measurements,  the  snapshot  estimator  then  generates  a 
"snapshot"  (instantaneous)  estimate  of  egomotion  states  and  terrain  shape.  The  accuracy  and 
reliability  that  can  be  achieved  by  the  snapshot  estimator,  however,  are  bounded  by  two  factors: 
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1)  the  high  signal-to-noise  ratios  due  to  violation  of  first-derivative  continuous  assumptions  in 
real  images  and  the  limitations  of  digital  imaging  technology;  and  2)  the  small  shift  between 
consecutive  image  frames  that  may  simply  not  contain  enough  information  for  reliable  motion 
and  shape  estimation. 

The  last  stage  a  Kalman  filter  improves  the  accuracy  of  the  snapshot  estimates  by  integrating 
information  over  time  and  exploiting  the  continuity  constraints  in  the  observer's  motion  states 
and  the  terrain  shape. 

This  modular  approach  provides  both  performance  improvements  and  simplifications  for 
system  implementation.  First,  it  avoids  complicated  interactions  between  low-level  image 
processing  and  high-level  motion  state  and  terrain  shape  estimation,  and  consequently  avoids 
nonlinear  relationships  induced  by  such  interactions  in  an  integrated  approach.  Second, 
separation  of  image  processing,  snapshot  estimation,  and  Kalman  filtering  makes  each  problem 
simpler  and  its  solution  easier  and  more  robust.  Third,  the  system  has  a  natural  interface  to 
incorporate  external  information  for  performance  enhancement.  For  example,  if  the  external 
information  on  the  observer  motion  state  is  available,  it  can  then  be  easily  incorporated  into  the 
Kalman  filter  without  reconfiguration  of  the  entire  system.  Finally,  we  believe  that  this  modular 
architecture  can  serve  as  the  basis  of  a  model  of  the  human  flow-field  based  vision  process. 

The  development  of  the  Kalman  filter  based  direct  approach  calls  for  accurate,  robust,  and 
efficient  static  and  dynamic  estimation  algorithms.  The  Singular  Value  Decomposition  (SVD) 
Least  Squares  (LS)  algorithm  and  the  Square-Root  Kalman  filter  algorithm  are  the  two  state-of- 
art  techniques  for  static  and  dynamic  estimation,  and  are  briefly  described  below  in  sections  2.2.2 
and  2.2.3. 

2.2.2  Singular  Value  Decomposition  (SVD)  Least  Squares  (LS)  Estimation  Algorithm 

LS  estimation  is  one  of  the  most  widely  used  static  estimation  techniques.  Modeling  the 
noisy  measurements  of  an  unknown  system  (process)  via  a  parametric  model,  LS  estimation 
provides  a  best  estimate  of  system  parameters  in  the  least  squares  sense.  Formally  speaking,  let  z 
represent  a  column  vector  of  M  unknown  parameters.  Given  N  noisy  measurements  of  vector  aj 
and  scalar  bj,  and  assuming  a  linear  parametric  relationship  between  the  parameters  and 
measurements,  defined  by 

bi  =  a?'z-i-^i,  i=  1,2, -,N  (2.2-1) 

where  represents  an  unknown  additive  noise,  LS  estimation  determines  an  optimal  estimate  of 
z,  z  that  minimizes  an  LS  residual  function  r  (z)  ,  defined  by 
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N 

r(z)  =  S  [bj  -a^z]^ 

i  =  1 


(2.2-2) 


Grouping  N  measurements  {aj }  and  {b; }  via  the  single  matrix  and  vector  expressions 


ai 

bi 

A  = 

I 

b  = 

* 

bw 

the  linear  LS  estimation  can  then  be  written  as  determining  the  estimate  z  that  minimizes 


r(z)  =  lb  -  Azi 


(2.2-4) 


The  SVD  LS  algorithm  is  the  method  of  choice  for  solving  most  linear  LS  estimation 
problems  since  it  provides  numerically  stable  solution  method  that  is  not  susceptible  to  computer 
roundoff  error.  It  also  provides  an  error  covariance  estimate  of  the  LS  solution  itself  (Press, 
Teukolsky,  Vetterling  &  Flannery,  1992). 

The  SVD  method  is  based  on  the  following  theorem  of  linear  algebra:  any  N  x  M  matrix, 
whose  number  of  rows  N  is  greater  than  or  equal  to  its  number  of  columns  M,  can  be  written  as 
the  product  of  an  N  x  M  column-orthogonal  matrix  U,  an  M  x  M  diagonal  matrix  W  with 
positive  or  zero  elements  (the  singular  values),  and  the  transpose  of  an  M  x  M  orthogonal  matrix 
V: 

A  =  (2.2-5a) 


where 

U'^U  =  V'^V  =  =  I,  and  W  =  diag  (wj,  w^)  (2.2-5b) 

As  a  basic  linear  algebra  operation,  the  SVD  method  has  been  implemented  in  various 
computer  languages,  e.g.,  a  C  computer  code  is  given  in  Press  et  al.  (1992). 

Once  the  SVD  of  the  A  matrix  is  obtained,  the  LS  solution  z  of  problem  (2.2-4)  can  then  be 
found  as 

z  =  VW-  'U'^b  (2.2-6) 

The  covariance  P  in  the  estimate  of  2  is  given  by 

P  =  Cov(z)  =  VW~  ^V'^  (2.2-7) 

The  SVD  LS  algorithm  can  also  be  extended  to  solve  the  constrained  linear  LS  estimation 
problem,  i.e.,  to  determine  the  estimate  that  z  minimizes 
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r(z)  =  lb  -  AzI 

subject  to  the  constraint  that 
z^z  =  1 


(2.2-8a) 

(2.2-8b) 


Specifically,  denoting 

y  =  V'^z  and  c  =  U^b  (2.2-9a) 

we  have  the  LS  problem  (2.2-8)  being  transformed  to  determine  the  y  that  minimizes 

r(y)  =  ib  -  Wyl  (2.2-9b) 


subject  to  the  constraint  that 

y'^y  =  1 


(2.2-9C) 


Defining  the  Lagrange  function 

r(y,  ?t)  =  lc-Wyl-i-)L(y’^y-l)  (2.2-10) 

and  noting  that  W  is  a  diagonal  matrix,  we  have  the  parametric  solution  y{X)  to  (2.2-10)  given  by 
w  c 

yi  =  -2^,fori=  1,2,...,M  (2.2-11) 

wp  +  A 

Substituting  (2.2-11)  into  y^y  =  1,  the  Lagrange  multiplier  A,  is  then  constrained  by  the 
following  single  variable  equation: 

i(^^)^=l  (2.2-12) 

i  =  1  wp  -I-  A 

The  equation  has  a  unique  solution  and  can  be  found  using  standard  root-finding  algorithms, 
such  as  Newton' s  method  given  in  Press,  Teukolsky,  Vetterling  &  Flannery  (1992).  After  the 
Lagrange  multiplier  A  is  found,  the  original  parameter  vector  z  and  its  estimated  error  covariance 
can  be  found  via 


z  =  Vy  where  y  =  (  ,  •••,  ^ ^  )^ 

wf  +  A  Wj(]  +  A 


Cov(z)  =  V  diag( 


w^  + A’  ’  w^-hA 


(2.2- 13a) 
(2.2- 13b) 


The  SVD  LS  estimation  algorithm  has  a  computational  complexity  of  0(NM2)  flops.  A  flop 
roughly  constitutes  the  effort  of  doing  a  floating  point  add,  a  floating  point  multiply,  and  some 
indexing. 
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2.2.3  Square-Root  Kalman  Filter  Algorithms 

The  Kalman  filtering  approach  is  a  Bayesian  estimation  technique  used  to  obtain  accurate 
and  robust  estimates  of  stochastic  dynamic  system  states  that  are  observed  with  noisy 
measurements.  The  implementation  of  a  Kalman  filter  requires  specification  of  three 
probabilistic  models:  system,  measurement,  and  prior  estimate  (Anderson  &  Moore,  1979). 

The  system  model  describes  the  system  state  evolution  over  time,  and  is  defined  by  the  state- 
space  transition  equation 

Xk+i  =  AkXk  +  ek,  ek~N(0,Qk)  (2.2-14) 

where  is  the  current  system  state  vector  at  time  k,  A  ^  is  a  known  transition  matrix,  and  is  a 
Gaussian  noise  with  zero  mean  and  a  covariance  Qj.. 

The  measurement  model  relates  the  state  vector  to  the  noisy  measurement  vector  via  a 
measurement  matrix  and  the  addition  of  a  Gaussian  noise  Tik  with  zero  mean  and  a 
covariance  R^.  It  is  defined  by  the  measurement  equation 

yk  =  CkXk-nik,  Tik~N(0,  Rk)  (2.2-15) 

The  prior  estimate  model  describes  the  knowledge  about  the  initial  system  state  Xq  and  its 
covariance  Pg  before  the  first  measurement  is  taken,  and  assumes  uncorrelated  system  and 
measurement  noises.  It  is  defined  by 

Xo  =  E[Xo]  (2.2-16a) 

Po  =  CovE[xo]  and  E[ek'nl]  =  0  (2.2-16b&c) 

We  always  assume  that  there  is  no  initial  knowledge  on  the  system  states  before  the  first 
estimate.  In  other  words,  the  prior  model  for  the  system  states  is  fixed,  and  is  always  given  by 

Xq  =  0,  Pq  =  diag  (oo)  (2.2-  16d) 

Once  the  system,  measurement,  and  prior  models  are  specified,  the  Kalman  filter  algorithm 
operates  in  two  phases,  prediction  and  correction,  as  shown  in  figure  2.2-2  via  the  right  and  left 
blocks  separated  by  a  dashed  line. 
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correction  I  prediction 


Figure  2.2-2:  Kalman  Filter  Block  Diagram 


In  the  prediction  phase,  the  previous  state  estimate  X;,_  j  and  covariance  estimate  Pij_  j  are 
extrapolated  to  predict  the  current  state  xj^  and  via  state  extrapolation  and  covariance 
extrapolation  defined  by 

Xk  =  Ak_iXk_,  (2.2-17) 

Pk  ^^k-lPk-l^k-l +Qk-1  (2.2-18) 

In  the  correction  phase,  the  predicted  covariance  is  used  to  compute  the  Kalman  gain  matrix 
Kk  and  to  update  the  covariance  matrix  Pk  via 

Kk  =  PkCk[CkPkCk +Rk]"^  (2.2-19) 

Pk  =  [I-KkCk]Pk  (2.2-20) 

The  measurement  residual  -  CkXk  is  then  weighted  by  the  gain  matrix  Kk  and  added  to  the 
predicted  state  Xk  to  yield  the  updated  state  estimate  Xk  via 

Xk  =  Xk  +  Kk[yk  -  CkXkl  (2.2-2 1 ) 

The  conventional  Kalman  filter  algorithm  is  defined  by  equations  (2.2-17)  to  (2.2-21). 
However,  it  is  prone  to  serious  numerical  difficulties  that  can  cause  a  loss  of  positive  definiteness 
in  updating  the  covariance  matrix  Pk  leading  to  divergent  state  estimates.  Although  algebraically 
equivalent  to  the  conventional  Kalman  filter  algorithm,  square-root  Kalman  filter  algorithms 
exhibit  improved  numerical  precision  and  stability,  especially  in  those  ill-conditioned  problems 
where  conventional  algorithms  fail. 

Square  root  Kalman  filter  algorithms  are  developed  using  a  square  root  decomposition 
feature  of  symmetric,  positive  semidefinite  matrices — the  class  of  matrices  which  the  covariance 
matrix  Pk  belongs  (Maybeck,  1979).  This  feature  dictates  that  for  each  Pk  there  exists  at  least  one 
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square  root  matrix  such  that 

Pt  =  v^yP^ "  (2.2-22) 

The  basic  idea  of  the  square  root  algorithms  is  to  replace  the  recursive  updates  of  in  both 
the  prediction  and  correction  phases  with  the  updates  of  and  to  update  the  state  estimate 
using  an  optimal  gain  computed  using  instead  of  The  update  of  /P^  guarantees  a 
positive  definite  covariance  matrix  Pj^,  thus  eliminating  the  possibility  of  state  estimate 
divergence.  Various  versions  of  square  root  Kalman  filter  algorithm  implementation  are  given  in 
Maybeck(1979). 
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3.  PASSIVE  Sensor  motion  and  terrain  Awareness  System  (ATAS) 

This  chapter  provides  a  description  of  the  passive  sensor  motion  and  terrain  awareness 
system  (ATAS).  Section  3.1  presents  the  architecture  of  the  ATAS  that  is  composed  of  an  image 
sensor  and  three  processors:  an  image  processor,  a  snapshot  estimator,  and  a  Kalman  filter. 
Section  3.2  describes  the  basis  for  the  flow-based  egomotion  and  terrain  shape  estimation 
problem,  and  derives  the  basic  constraint  relating  unknown  motion  and  terrain  shape  parameters 
to  directly  measurable  image  gradients.  Section  3.3  then  describes  the  image  processor,  section 
3.3  the  snapshot  estimator,  and  section  3.4  the  Kalman  filter. 


3.1  System  Architecture 

Figure  3.1-1  shows  the  architecture  of  the  passive  sensor  altitude  and  terrain  awareness 
system.  The  architecture  is  developed  using  a  Kalman  filter  based  direct  approach,  and  is 
composed  of  an  image  sensor  and  three  processors:  an  image  processor,  a  snapshot  estimator, 
and  a  Kalman  filter,  as  shown  in  figure  3.1-1. 


snapshot  smoothed 

images  measurements  motion  &  shape  motion  &  shape 

estimates  estimates 


B 

RSjl 

IBil 

i 

>L<. 

compute  image 
measurements 

i 

generate  snapshot 
motion  &  shape 
estimates 

Figure  3.1-1;  Passive  Sensor  Altitude  and  Terrain  Awareness  System 


The  image  sensor  provides  sequences  of  real-time  two  dimensional  dynamic  images  of  the 
overflown  terrain.  We  simulated  real-time  imaging  operations  using  CGI  and  pre-recorded 
imagery.  With  CGI,  sequential  image  frames  were  either  fed  directly  to  the  image  processor  or 
were  sent  to  a  video  recorder  for  storage  and  later  use.  With  the  pre-recorded  imagery,  a  real¬ 
time  frame  grabber  operated  on  the  video  signal  to  produce  the  necessary  sequence  of  image 
frames,  which  were  then  fed  direcetly  to  the  image  processor.  The  simulation  of  the  image 
sensor/digitizer  is  described  in  greater  detail  later  in  section  4. 1 . 

The  image  processor  computes  image  measurements  each  time  a  new  image  (frame)  is 
acquired.  These  image  measurements  consist  of  first  order  image  gradients  and  results  from 
arithmetic  operations  between  the  image  gradients  and  the  image  pixel  coordinates.  Based  soley 
on  the  current  image  measurements,  the  snapshot  estimator  then  generates  a  "snapshot" 
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estimate  of  motion  states  and  terrain  shape.  Using  the  snapshot  estimates  as  state  measurements, 
the  Kalman  filter  then  improves  estimation  accuracy  by  integrating  information  over  time  and 
exploiting  the  continuity  constraints  in  the  observer's  motion  states  and  the  terrain  shape. 


3.2  Problem  Formulation  and  Estimation  Constraint  Equation 

Consider  an  observer  undergoing  both  translational  and  rotational  motion  over  motionless 
terrain  as  illustrated  in  figure  3.2-1.  The  observer's  linear  and  angular  velocity  are  denoted  by  the 
vectors  V  and  £2,  respectively.  Let  Pj  be  a  viewed  point  on  the  terrain,  located  at  a  range  pj  from 
the  observer.  In  the  observer's  frame  of  reference,  the  unit-length  line-of-sight  (LOS)  direction 
vector  uj  =  pj/Ipji  will  appear  to  change  with  time,  at  a  rate  given  by 

Uj  =  (OiXUi  (3.2-la) 


where  cOj  is  the  rotation  rate  of  the  LOS  vector  (effectively,  the  3D  visual  flow  seen  by  the 
observer),  and  is  shown  by  Zacharias  and  Levison  (1980)  to  be  given  as 


tOj  =  Uj  X  (Uj  X  £2)  -  --  (uj  X  Uy) 


(3.2-lb) 


where  Uy  is  the  observer's  heading  vector,  and  tj  is  the  impact  time,  defined  by 

-  lYi  »  Xj  -  1^1 


(3.2-2a,b) 


The  unit  length  heading  vector  defines  only  the  direction  of  the  observer's  motion,  but  not  his 
speed.  The  impact  time  is  the  elapsed  time  before  the  observer  impacts  with  the  surface  at  point 
Pj,  //the  observer  were  to  head  directly  at  Pj^  at  the  speed  IVI. 


Angular  Velocity  Q 
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We  now  develop  a  model  for  estimating  egomotion  and  terrain  shape  based  on  the  image 
gradient  information.  The  model  employs  two  additional  key  relationships  that  relate  the  directly 
observable  image  intensities  captured  on  a  hypothesized  image  plane,  to  the  unknown 
motion/terrain  variables. 


The  first  key  relationship  relates  the  3D  visual  flow  to  its  2D  projection  on  a  hypothesized 
image  plane.  Consider  the  2D  image  of  the  terrain  formed  on  an  image  plane  within  the 
observer's  field-of-view  (FOV),  as  illustrated  in  figure  3.2-2.  Let  the  focal  point  be  the  origin  of  a 
screen  coordinate  system.  Let  Us  denote  the  unit  length  vector  normal  to  the  sensor  plane,  and  ds 
denote  the  effective  focal  length.  The  image  of  point  Pj,  given  by  Pj,  located  by  the  vector  dj  in 
the  image  plane,  is  defined  by  the  focal-length  normalized  x-y  coordinates  5}  =di/ds  =  (xj,  yi,  1)T. 
As  shown  by  Zacharias,  Miao  &  Riley  (1992),  the  rotation  rate  of  the  LOS  vector  Uj  generates 
the  following  rate  of  change  of  this  in-plane  location  vector  Sp 


03; 

)i  =  - - ^  = 

(Ui»Us) 


(3.2-3a) 


where  =  t j  /I5jl  =  Xj  /  y  1  +  x^  +  y^  is  the  plane  normal  (screen  coordinate  system)  impact 
time,  Sj  is  a  two  dimensional  vector  in  the  screen  x-y  plane,  and 


^  Xiyj  -(x2+l)  yj 
^  (y2  +  1)  -x,yi  -Xi 


(3.2-3b) 


(3.2-3C) 


Observer 


Figure  3.2-2:  Image  Plane  Geometry 
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The  second  key  relationship  relates  3D  visual  flow  to  the  instantaneous  spatial  and  temporal 
image  gradients  in  the  3D  image  plane.  Consider  how  the  intensity  Ij  of  image  point  Pj  changes 
with  time  at  an  in-plane  location  6j.  Using  the  partial  derivative  rule,  we  obtain  the  following 
expression  for  the  total  temporal  derivative  of  image  intensity; 

+  lit 

dt  dt  ‘y  dt  " 


where  bx  =  aii/9x  and  hy  =  dl^/dy,  and  we  have  assumed  the  image  brightness  function  is  first  order 
continuous  at  the  image  point  5,.  Assuming  that  the  brightness  of  any  terrain  point  is  constant 
over  time,  we  require  the  total  derivative  of  (3.2-4)  to  be  zero,  so  that 


li 


IX 


IX’  ^ly 


)Si 


(3.2-5) 


where  S;  is  given  by  eq.  (3.2-3a).  Substituting  (3.2-3a)  into  (3.2-5),  for  each  pixel  i,  we  obtain  an 
estimation  constraint  equation  that  relates  the  directly  measurable  image  intensity  Ij  (through  its 
gradients  lit,  liy)^  with  the  observer  motion  states  Q,  Uy,  and  terrain  impact  time  Xj  as  follows: 


=  + 


Xj^  "iv“v 


(3.2-6a) 


where 


hi£2 


and  hjy  = 


lix^ 

hyj 


(3.2-6b) 


3.3  Image  Processor 

The  image  processor  computes  the  image  measurements  {hij^^hiv,  lit)  each  time  a  new  image 
is  acquired.  The  processor  boils  down  to  computing  the  image  gradients,  since  computing  {hjn, 
hiv}  via  (3.2-3b)  involves  only  arithmetic  operations  between  image  gradients  and  image  pixel 
coordinates.  We  use  a  direct  numerical  differentiation  method  to  compute  the  image  gradients. 
Specifically,  Let  k  be  the  current  time  (image  frame)  index,  and  At  be  the  frame  interval.  At  the 
current  time  k,  we  have  image  k  defined  by  its  intensity  map  {IfXj,  yj,  t)}  ,  where  index  i  denotes 

the  ith  pixel  in  the  image.  The  image  gradients  are  computed  for  each  pixel  using  the  following 
two-point  central  difference  formulas: 

I(Xj+Ax,  yj,  t  -  At)  -  I(Xj-  Ax,  yi,  t  -  At) 

2Ax 

I(Xi,  yj  -I-  Ay,  t  -  At)  -  I(Xj,  y;  -  Ay,  t  -  At) 

^•y""  2Ay 
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where  At  is  the  temporal  sampling  time,  and  Ax  and  Ay  are  the  x  and  y  pixel  dimensions.  The 
two-point  central  formula  provides  higher  accuracy  than  the  standard  two-point  backward 
formula  (OfAx)^  versus  Of  Ax)),  at  the  cost  of  storing  one  additional  old  image  frame.  The  cost  of 
this  is  relatively  low.  Moreover,  since  the  temporal  and  spatial  gradients  are  estimated  at  the 
same  (delayed)  point  in  time,  no  phase  shift  is  introduced.  After  Ijx  and  liy  are  computed,  the 
measurements  hjQ  and  hjy  are  computed  using  (3.2-3b). 


Three  major  processes  contribute  to  image  measurement  (gradient)  errors:  the  spatial, 
temporal,  and  intensity  quantizations  performed  in  generating  the  digitized  image  frames  and  in 
computing  the  gradients.  Assuming  that  the  errors  induced  by  the  three  types  of  quantization  are 
independent,  it  can  be  shown  that  the  average  error  A  and  variance  in  the  measurements  are  as 


follows: 


^(lix)  =  - -^1x3  (x„yi) 


g  ^y3  (^jj  yj) 

2.1  _  Af [i;;  (X,,  yj)]^  1 

108  ■'24G2a,2 


Ar[Iy4(x.,y.)]^  1 

^  108  ^  24G2a2 


O  n  =  I 


C§l"3-nyn  (Xi,yi)x?  "yf 


(At)^[Z^CSl';;-nyn  (Xi,  yi)x^"yf  ]2  ^ 

108  24G2(At)2 


(3.3-2) 


where  Aj  =  Aj^  =  Ajy  is  the  pixel  size,  C"  is  the  binomial-coefficient,  and  G  is  the  image  gray 
level.  The  derivation  of  the  formulas  are  provided  in  Appendix  A. 


The  average  error  and  variance  formulas  (3.3-2)  indicate  that  the  average  errors  and  error 
variances  are  in  proportion  to  the  high-order  image  gradients.  Consequently,  if  the  image 
intensity  function  is  discontinuous  at  higher  order,  a  larger  error  will  occur.  In  such  a  situation,  a 
low-pass  filter  is  needed  to  smooth  the  images  before  the  gradient  can  be  computed  using  the 
standard  differentiation  formula. 
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The  application  of  a  low-pass  filter  should  not  be  abused,  however,  since  the  low-pass 
filtering  also  reduces  the  absolute  value  of  image  gradients.  Consequently,  the  image  signal-to- 
noise  ratio  may  be  reduced  instead  of  enhanced  through  low-pass  filtering.  In  addition,  low-pass 
filtering  adds  computational  expense  to  image  measurement  computation. 

The  complete  image  processor  algorithm  is  given  as  follows: 

Algorithm:  Two  Point  Center  Difference 
For  i  =  1 . N 

Step  1:  Compute  image  gradients  Ij^,  Ijy,  and  from  (3.3-1) 

Step  2:  Compute  the  image  measurements  hj^  and  h,y  from  (3.2-3b) _ _ _ _ _ 

Computation  of  image  measurements  using  this  algorithm  requires  13N  flops,  where  3N 
flops  are  for  gradient  computation,  and  ION  flops  are  for  image  measurement  computation.  A 
flop  roughly  constitutes  the  effort  of  doing  a  floating  point  add,  a  floating  point  multiply,  and 
some  indexing. 

Image  measurement  computation  can  be  done  in  parallel  since  computation  for  each  pixel 
can  be  performed  independently.  Consequently,  if  a  P  processor  parallel  computer  is  available 
for  tjie  computation,  a  computation  reduction  of  P  times  can  then  be  achieved. 


3.4  Snapshot  Estimator 

The  objective  of  the  snapshot  estimator  is  to  determine,  each  time  an  image  is  acquired,  the 
snapshot  (instantaneous)  motion  and  terrain  estimates  Cl,  Uy,and  which  minimize 


r(Q,  Uv,{Xiz})  =  X 

i  =  1 


(hTQO  + 


(3.4-1) 


given  N  measurements  {hiQ,hiv,  lit}-  Direct  inspection  of  (3.4-1),  however,  reveals  that 
satisfaction  of  (3.4-1)  alone  cannot  provide  a  unique  solution  to  the  snapshot  estimation  problem 
since  there  are  N  -F  5  unknown  motion  and  terrain  shape  variables  to  be  found  from  N 
constraints.  We  resolve  this  "under-constraint"  difficulty  by  requiring  that  the  terrain-impact  time 
map  {Xiz}  satisfy  additional  terrain  structural  constraints. 


This  can  be  done  if  we  impose  on  the  terrain  impact  time  map  {Xja}  a  parametric  terrain 
structure  constraint  specified  by  M  structural  parameters,  where  M  <  N-5.  Subject  to  this 
additional  structural  constraint,  the  snapshot  estimation  problem  then  becomes  one  of  solving 
(M-i-5)  motion  and  terrain  shape  variables  from  N  overconstrained  equations,  where  the  excess 
constraint  equations  are  used  to  improve  the  accuracy  of  the  solution. 


25 


R92191 


Charles  River  Analytics 


Two  types  of  parametric  terrain  structure  constraints  are  considered.  The  first  type  explicitly 
requires  that  the  terrain  impact  time  map  {Xiz}  satisfy  a  polynomial  geometry  model  such  as  flat 
surface,  quadratic  surface,  etc.  The  second  type  does  not  assume  a  priori  a  surface  surface  model, 
but  requires  that  the  terrain  impact  time  map  { Xjz }  satisfy  a  pre-determined  impact-time  (depth) 
layer  structure.  Section  3.4.1  describes  the  geometry  based  snapshot  estimator,  while  section 
3.4.2  describes  the  structure  based  snapshot  estimator. 


3.4.1  Geometry  Based  Snapshot  Estimator 


Figure  3.4-1  illustrates  the  (polynomial)  geometry-based  approach  to  snapshot  estimation. 
Assuming  that  the  terrain  surface  can  be  modeled  by  a  polynomial  geometry  function,  the 
approach  determines  an  LS  solution  to  the  model  parameters  and  motion  state  parameters. 


measurements 


snapshot  terrain 
estimates 


snapshot  observer 
state  estimate 


Figure  3.4- 1:  Geometry  Baed  Approach  to  Snapshot  Estimation 


We  begin  by  denoting  a  terrain  point  Pj  by  its  impact  time  vector 


-Zl 

IVI 


~  IVI  ^^ix’  "^iv> 


(3.4-2) 


relative  to  an  observer-fixed  coordinate  system.  Under  perspective  projection,  the  surface  point 
Pi  projects  to  a  point  in  the  normalized  image  plane 

(Xi,  y-f  =  (tix/Xj^,  tiy/Ti^)T  (3.4-3) 

We  model  the  inverse  of  the  z-axis  terrain  impact  time  tiz  as  an  n-th  order  polynomial  of  the 
in-plane  image  coordinates  (xi,  yj),  given  by 

(3-4-4a) 

•'iz 

where 
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c  =  (cq,  C2,  Cn2+3n/2)^  (3.4-4b) 

fj  =  (1,  Xj,  Yi,  x",  xn-‘ui,  (3.4-4c) 

where  c  is  the  model  parameter  vector  of  dimension  M  =  n^  +  3n/3,  and  fj  is  the  image  coordinate 
polynomial  function.  We  choose  the  inverse  impact  time  model  (3.4-4a)  because  the  impact  time 
appears  in  the  estimation  constraint  by  its  inverse.  Transformation  of  the  terrain  model  from  the 
inverse  impact  time  format  to  normal  impact  time  format,  and  from  image  coordinate 
representation  (xi,  yO  to  the  terrain  coordinate  representation  (Xjx,  Xiy)  can  be  found  in 
Negahdaripour  &  Lee,  (1992). 

Requiring  that  the  terrain  points  (impact  times)  satisfy  the  geometry  model  (3.4-4a)  means 
that  (3.4-4a)  can  be  substituted  into  (3.4-1).  After  substitution,  the  simplified  snapshot  estimation 
problem  becomes  one  of  determining  the  c,  Q,  and  Uy  which  minimize; 

r(c,  n,  Uy)  ^  i:  I  I.t  -  hJ^Q.  -  cTf.hT  uy  f  (3.4-5) 

i  =  1 

across  ail  feasible  model  parameters  c,  the  angular  velocity  Q,  and  heading  vector  Uy. 

The  residual  function  r(c,  Q,  Uy)  of  (3.4-5)  is  a  bilinear  function  of  the  model  parameter 
vector  c,  angular  velocity  L>,  and  heading  vector  Uy.  In  other  words,  if  either  (c,  f2)  or  Uy  were 
known,  estimation  of  Uy  or  (c,  Q.)  would  be  an  easily  solvable  linear  LS  minimization  problem. 
We  employ  a  two  stage  integration  approach  to  estimate  (c,  Q)  and  Uy. 

Specifically,  given  an  estimate  of  the  heading  vector  Uy,  the  snapshot  terrain  shape  estimation 
problem  of  (3.4-1)  becomes  a  linear  LS  estimation  problem  to  determine  Qandc  which 
minimize 


r(n,  c)  =  llj  - 


(uj'hiy)fl^ 


(3.4-6) 


The  snapshot  estimates  Cl  and  c  and  their  covariances  and  P^,  can  then  be  found  using  the 
SVD  LS  algorithm,  presented  earlier  in  section  2.2. 

The  second  stage  assumes  angular  velocity  and  model  coefficient  vector  estimates  (i2,  c),  so 
that  the  snapshot  terrain  shape  estimation  problem  of  (3.4-1)  becomes  a  constrained  linear  LS 
estimation  problem  to  determine 
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fly  which  minimizes 


r(Uy)  =  lbj- 


subject  to  ujuy  =  1 


(3.4-7a) 


where 


=  (3.4-7b) 

The  snapshot  estimate  fly  and  its  covariance  Py  can  then  be  found  using  constrained  SVD  LS 
algorithm,  presented  earlier  in  section  2.2. 

Figure  3.4-2  summarizes  the  block  diagram  of  this  two  stage  integration  approach.  Starting 
with  an  initial  guess  on  Uy,  the  snapshot  terrain  and  angular  velocity  estimator  generates  an 
estimate  of  the  terrain  model  parameters  c  and  Q,  which  in  turn  drive  the  snapshot  heading 
estimator  to  update  the  estimate  on  Uy.  The  updated  heading  estimate  Uy  is  then  fed  back  to  the 
snapshot  terrain  and  angular  velocity  estimator  to  generate  a  new  estimate  of  the  terrain  model 
parameters  and  the  angular  velocity.  The  iteration  proceeds  until  a  convergence  criterion  on 
residual  function  reduction  is  satisfied.  After  the  convergence,  the  impact  time  map  estimate 
and  its  variance  can  then  be  determined  via 

t-  -  ^ 
bz  - 

f  Jp  f. 

<^(^iz)  =  (3.4-8a,  b) 

^iz 


terrain  shape  & 
angular  velocity 
estimates 
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Note  that  a  pure  guess  on  Uy  is  needed  only  for  the  first  snapshot  estimate.  The  snapshot 
estimate  at  any  other  later  time  (frame)  can  always  use  the  value  of  the  previous  Uy  as  a  starting 
estimate  since  the  observer  motion  states,  in  general,  do  not  change  much  from  one  image  frame 
to  another. 

The  complete  geometry  model  based  snapshot  estimation  algorithm  is  given  as  follows: 

Algorithm:  Geometry-BSsed  Snapshot  Estimator 

1)  Initiate 

If  it  is  the  first  time  the  algorithm  is  called,  select  an  initial  (unit  length)  Uy 

2)  Continue  snapshot  estimation  until  residue  function  r  converges 

Call  SVD  LS  algorithm  to  estimate  model  parameters  C  and  angular  velocity  Q 

Call  constrained  SVD  LS  algorithm  to  estimate  heading  vector  Uy  _ _ 

Terrain  and  angular  velocity  estimation  complexity  is  of  the  order  (M+3)2N  flops,  while 
heading  estimation  complexity  is  of  the  order  9N  flops.  If  the  geometry  based  snapshot  estimator 
takes  L  iterations  to  converge,  then  overall  computational  complexity  is  of  the  order 
L((M+3)2+9)N  flops. 


3.4.2  Structure-Based  Snapshot  Estimator 

The  structure-based  snapshot  estimator  models  the  terrain  surface  as  multiple  depth  layers, 
each  containing  terrain  points  of  the  same  depth  (impact  time).  Given  the  number  of  layers 
allowed  to  define  a  terrain  surface,  the  estimator  first  optimally  assigns  terrain  points  among 
layers  (or  determines  the  optimal  terrain  structure)  before  making  the  motion  and  terrain  shape 
estimates. 


We  use  a  structure  of  M  equal  distance  layers  to  define  a  terrain,  where  S  =  { 1  A_  2A,  ...,  mA, 
...,  MA}  and  A  is  a  given  constant  defining  the  relative  distance  between  layers.  Note  that  the 
structure  is  defined  independent  of  the  coordinate  system  and  in  relative  distance  terms. 
Requiring  that  each  terrain  point  be  assigned  to  one  of  the  M  layers,  we  formulate  the  LS 
snapshot  estimation  problem  to  minimize  the  following  residual  function  r({aj},  Q,  y)  across  all 
feasible  assignments  of  terrain  points  (aj),  all  candidate  angular  velocities  Q,  and  all  candidate 
motion-scalar  variables  y  =  Uy/x 


r({ai},  Q.,  y)  = 


L. 


|2 


aihiv 
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(3.4-9) 


where  the  (Nx6)  matrix  [H^,  Hy]  is  determined  by  the  image  measurements  and  the  terrain 
point  assignment  policy.  The  only  difference  between  (3.4-9)  and  the  original  snapshot 
estimation  problem  (3.4-1)  is  that  in  (3.4-1)  the  impact  times  can  be  any  real  number,  but  in  (3.4- 
8)  they  are  limited  to  a  discrete  set  of  values  defined  by  the  terrain  structure  S. 


Notice  that  the  angular  velocity  measurement  matrix  Hq  does  not  depend  on  the  terrain  point 
assignments.  The  LS  solution  to  Q  is  given  by 

0  =  Vj^diag(l/^„  l/l^,  l/X3)U^(It-Hyy)  (3.4-10) 


where  Vq,  diag  (Xj,  A-2,  X3),  and  Uq  are  the  SVD  of  Hq.  Substituting  (3.4-10)  into  (3.4-9),  the 
problem  becomes  one  of  minimizing  the  following  residual  function  r({aj},  y)  across  all  feasible 
assignments  of  terrain  points  and  all  candidate  motion-scalar  variables  y: 

r(  { a; } ,  y)  =  I  (I  -  J)!^  -  (I  -  UnU^)Hyy  f  (3.4- 1 1) 


Approximating  the  projection  matrix  by  its  diagonal  components,  the  snapshot 

estimation  problem  (3.4-1 1)  reduces  to  determining  y  and  {aj}  which  minimize 


r(y.  {ai})  = 


r  . 

: 

2 

b. 

- 

aih[ 

y 

[  • 

• 

=  lb  -  Hyp 

where  b;  =  diagi(I  -  and  hj  =  diagi(I  -  U^U^lhiy. 


(3.4-12) 


Figure  3.4-4  illustrates  basic  ideas  to  convert  problem  (3.4-12)  into  a  pure  structure 
extraction  problem.  Given  a  terrain  structure,  the  terrain  and  heading  estimation  problem  (3.4-12) 
can  be  transformed  into  two  sub-problems  of:  1)  finding  the  optimal  terrain  structure,  and  2) 
determining  the  optimal  motion-scalar  variable,  where  the  second  problem  can  be  solved  in 
parametric  terms  as  a  function  of  the  first  problem.  Removing  the  motion-scalar  variable  via  the 
parametric  representation,  we  reach  a  pure  structure  extraction  problem  that  finds  the  optimal 
terrain  structure  without  knowing  either  the  motion  states  or  the  terrain  impact  time  (depth). 
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Figure  3.4-4:  Problem  Formulation  of  Structure  Extraction 


Specifically,  for  any  given  terrain  point  assignment  policy  {aj},  the  LS  solution  to  the 
motion-scalar  variable  y  is  given  by 

(3.4-13) 

Substituting  (3.4-13)  back  into  (3.4-12)  to  remove  the  dependence  on  y,  problem  (3.4-12) 
becomes  one  of  assigning  terrain  points  among  the  structure  layers  to  minimize  the  residual 

r({ai})  =  b^b-  b'^H[H'^H]- 'H'^b  (3.4-14) 

This  problem  is  equivalent  to  maximizing  the  overall  projection  length  b^H[H^H]“ ’H^b, 
since  b  b  is  not  affected  by  the  assignment  policy  {aj}.  Problem  (3.4-14)  is  thus  a  pure  structure 
extraction  problem  since  it  depends  neither  on  the  heading  vector  nor  the  actual  terrain  impact 
times  (depth).  Its  solution  provides  only  a  relative  description  of  the  terrain  structure  (which 
terrain  point  is  at  which  layer),  but  not  its  actual  impact  time  (depth).  The  significance  of  the 
formulation  is  that  it  breaks  the  chicken-and-egg  circle  between  structure  extraction  and 
state/depth  estimation  by  using  a  relative  structural  representation. 

We  now  develop  an  iterative  algorithm  that  optimally  assigns  N  terrain  points  among  M 
structure  layers,  one  point  at  each  iteration.  Let  n  be  the  current  iteration,  a"  be  the  point  i's 
current  assignment,  and  e({a['})  be  the  current  projection  length.  At  iteration  n+l,  we  would 
change  the  point  i's  assignment  to  af  ^  =  1,2, ...,  M.  Define 
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li"  =  ’h, 

=  (3.4-15a,b) 

Appendix  B  shows  that  the  optimal  assignment  a"'*'*  is  the  solution  to  the  following  quadratic 
function 

[aftinbj  -  Yf]|ifbi(a,"  +  >  -  af)^  +  [pnbf  -  (Yf)2](aJ'  + '  -  af)  +  [b^  -  af-iflyn  =  0  (3.4-16) 

when  a"”^'  can  be  any  value.  In  other  words,  the  optimal  assignment  a”  ’  can  be  determined  by 
comparing  the  projected  length  reduction  between  only  two  assignments  with  a  few  arithmetic 
operations,  independent  of  the  number  of  layers  M  which  is  usually  large.  The  optimal 
assignment  is  then  a  layer  that  belongs  to  the  structure  S,  and  is  closest  to  the  largest  projection 
length  reduction  assignment . 

Note  that  the  optimal  assignment  at  each  iteration  always  increases  the  overall  projection 
length  when  the  assignment  is  different  from  the  previous  one.  On  the  other  hand,  the  projection 
length  increment  is  upper  bounded  by  b|bt  and  cannot  increase  indefinitely.  Consequently,  the 
algorithm  will  converge  to  the  optimal  solution  through  iterations. 

After  the  structure  extraction,  the  snapshot  motion-scalar  variable  estimate  y  (and  the 
heading  vector  estimate  fly  =  y  /  lyl)  and  angular  velocity  estimate  O,  as  well  as  their  covariance 
matrices,  can  then  be  determined  using  the  SVD  LS  algorithm  to  solve  the  linear  LS  problem 
(3.4-9)  for  the  given  terrain  point  assignment  policy  {2;}.  Finally,  the  snapshot  terrain  impact 
time  estimates  }  and  their  variances  {of  }can  be  computed  by 

tiz  =  lyl/ai  anda?  =  |yph[Pyhi  (3.4-17) 

Figure  3.4-3  illustrates  the  three  module  architecture  of  the  structure  based  snapshot 
estimator:  a  structure  extractor,  an  egomotion  estimator,  and  an  impact  time  map  estimator.  The 
structure  extractor  takes  the  image  measurements  from  the  image  processor,  and  determines  the 
optimal  terrain  structure,  i.e.,  which  terrain  point  belongs  to  which  layer.  Knowing  the  terrain 
structure,  the  egomotion  estimator  and  impact  time  map  estimator  then  generate,  in  parallel,  the 
LS  snapshot  motion  states  and  impact  time  map  estimation. 
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Figure  3.4-3:  Structure  Extraction  Based  Snapshot  Estimator 


The  complete  structure  extraction  algorithm  is  summarized  as  follows: 
Algorithm:  Structure  Based  Snapshot  Estimator 


1)  Initiate 

Call  SVD  algorithm  to  compute  the  SVD  of  to  obtain  matrix  H  and  vector  bt 
Assign  all  the  terrain  points  to  layer  M/2 
Compute  the  matrix  ^ 

2)  Make  iterative  point  assignment 

Call  Point  Assignment  algorithm  to  determine  structure  assignments  { Sj } 

3)  Estimate  motion  state  and  impact  time  map 

Call  SVD  LS  algorithm  to  solve  (3.4-8)  for  heading  vector  u^,  angular  velocity  O  and  their  covariances 
Generate  impact  time  map  {T|}  estimates  via  (3.4-16) 


Algorithm:  Point  Assignment 


For  n  =  1  to  N 

Compute  coefficients  and  yf  from  (3.4-1 5a,  b) 

Determine  the  optimal  assignments  between  two  assignments 

Update  —  (a-^)^)hjhj]~  ^  using  Sherman-Morrison  Formula  given  in  Appendix  B 


In  actual  implementation,  we  have  employed  a  square-root  implementation  of  the  Sherman- 
Morrison  formula  to  overcome  the  numerical  deficiency  of  the  formula.  The  structure  based 
snapshot  estimator  has  a  computational  complexity  of  the  order  9LN  flops,  where  L  is  the 
convergence  iteration  count,  N  is  the  number  of  terrain  points. 
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3.5  Kalman  Filter 

Figure  3.5-1  shows  the  architecture  of  the  motion  and  terrain  Kalman  filter.  Using  the 
snapshot  state  and  terrain  impact  time  estimates  as  input  measurements,  we  employ  a  motion 
state  Kalman  filter  and  an  impact  time  map  Kalman  filter  to  incrementally  improve  the  motion 
and  terrain  shape  estimates.  Note  that  the  impact  time  map  Kalman  filter  needs  the  motion  state 
estimates  as  its  inputs  since  impact  time  evolution  is  specified  by  the  observer  motion  as 
explained  below  in  section  3.5.2. 


Figure  3.5-1:  Kalman  Filter 


The  Kalman  filter  is  a  Bayesian  estimation  technique  used  to  estimate  stochastic  dynamic 
system  states  using  noisy  observations  or  measurements.  Implementation  of  a  Kalman  filter 
requires  specification  of  three  probabilistic  models:  the  system  model,  the  measurement  model, 
and  the  prior  model.  The  system  model  describes  the  evolution  of  system  state  over  time.  The 
measurement  model  relates  the  system  state  to  the  observed  measurements.  The  prior  model 
describes  the  initial  knowledge  about  the  system  state  and  its  variance  before  the  first 
measurement  is  taken.  Once  the  three  models  are  specified,  the  Kalman  filter  can  then  be 
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implemented  using  the  square  root  Kalman  filter  algorithm,  presented  earlier  in  section  2.3. 

Model  complexity  is  primarily  determined  by  the  selection  of  a  coordinate  system:  the  proper 
system  can  often  transform  an  apparently  intractable  filtering  problem  into  one  that  is  readily 
solvable.  For  this  problem,  we  choose  an  observer-fixed  coordinate  system  (i.e.,  the  body 
coordinate  system)  since  it  leads  to  linear  dynamics  in  both  the  motion  and  impact  time  system 
models.  Once  the  estimates  are  determined  in  the  observer-fixed  coordinate  system,  they  can 
then  be  transformed  to  another  coordinate  system  via  conventional  coordinate  system  transforms. 
Appendix  C  gives  details  of  such  transforms. 

Section  3.5.1  describes  the  development  of  the  motion  state  Kalman  filter,  while  section  3.5.2 
describes  the  development  of  the  impact  time  (depth)  map  Kalman  filter. 


3.5.1  Motion  State  Kalman  Filter 

We  begin  by  deriving  the  dynamics  equations  for  the  observer's  translational  and  rotational 
motions.  Based  on  these  dynamics  equations,  we  then  specify  the  system  and  measurement 
models  for  the  motion  state  Kalman  filter.  The  specification  of  the  prior  models  is  done 
implicitly  by  assuming  zero  knowledge  about  state  variables  before  the  first  measurement. 

Let  k  be  the  current  time  (frame)  index  and  represent  the  position  of  the  observer  at  k.  For 
a  small  frame  time  At,  assuming  the  constant  translational  velocity  between  frames  we  have  the 
translational  dynamics  equations 

=  Rk-  1  +  ^k- 

(3.5-la,  b) 

Remember  that  we  have  chosen  an  observer-fixed  coordinate  system.  In  this  coordinate 
system,  R]^  =  0  since  the  current  position  of  the  observer  is  always  the  origin  of  the  coordinate 
system.  Consequently,  the  selection  of  the  observer-fixed  coordinate  system  reduces  the  system 
dynamics  equations  to  (3.5- lb). 

The  dynamics  equation  (3.5- lb)  can  also  be  written  as 

Vk  =  Vk_, 

Vk  =  ¥k-i 

yk  =  Yk-i  (3.5-2a,  b,  &c) 

where  =  Vj(^(cos  sin  sin  V|/]^  sin  cos  yi^-)"^ 

The  snapshot  estimator  provides  measurements  or  observations  only  on  the  heading  vector 
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fly  which  is  a  unit  vector  defined  by 

®v,  k  =  tj/]^  sin  sin  \j/k  sin  'Vk,  cos  7k) ^ 


(3.5-3) 


where  \|/k  is  the  observer's  heading  angle  and  is  the  observer's  flight  path  angle  Yk-  From 
the  snapshot  estimate  y,  the  heading  angle  estimate  \j/  and  the  flight  path  angle  estimate  7k  can 


be  computed  directly  via 


\j/]^  =  tan 


k(2)^ 

,fiv.k(l)j 


7k  =  cos->  (flv,k(3)) 


(3.5-4a,  b) 


Consequently,  we  define  a  new  state  variable — heading  angle  vector  =  (\|/|^,  yj;.)^’  and 
specify  the  translational  system  and  measurement  models  as 

^  =  +  (3.5-5a,b) 

where  zero-mean  Gaussian  random  variables  and  iii^.  account  for  the  unmodeled  dynamics  and 
observation  errors,  respectively. 

We  now  derive  the  dynamics  equations  for  the  observer's  rotational  motion.  There  are  many 
different  ways  to  describe  an  observer's  rotational  motion  (e.g.  the  nine-parameter  expression,  the 
quaternion  expression,  the  Rodrigures  expression,  and  the  conventional  Euler  angle  roll-pitch- 
yaw  expression).  Each  expression  leads  to  a  different  system  model.  We  select  Eulerangle  roll- 

pitch-yaw  expression  that  uses  three  consecutive  rotations  around  three  principle  axes  to 
characterize  an  arbitrary  rotation.  Let  0  =  (B^,  0y,  G,,)^  be  the  three  Euler  angles — yaw,  pitch, 

and  roll  angles  defining  the  rotations  around  the  three  axes  (X,  Y,  Z)  of  a  given  coordinate 
system.  The  corresponding  rotation  matrix  T  is  given  as  follows: 

T  =  RPY(0)  =  rot(02)  rot(0y)  rotCB^)  (3.5-6a) 


where 


rot(0x) 


eos  0x 
-sin  0x 
0 


sin  0^ 
cos  0x 
0 


0 

0 

1 


rot(0y)  = 


cos  0y 
0 

sin  0y 


0  -sin  0y 

1  0 

0  cos  0y 
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rotCS^)  = 


1  0  0 

0  cos  02  sin  ©2 

0  -sin  02  cos  02 


(3.5-6b,  c,  &d) 


Let  the  matrix  represent  the  observer's  current  orientation  specified  using  the  roll-pitch- 
yaw  expression.  For  a  small  frame  time  At,  assuming  the  constant  rotation  angles  between 
frames,  we  have  the  following  rotation  dynamics  equations: 

Tk  =  Tk_,RPY(0k_i) 

RPY(0k)  =  RPY(0k_,)  (3.5-7a,b) 


Remember  again  that  we  have  chosen  an  observer-fixed  coordinate  system.  In  this  coordinate 
system,  Tj^  =  I  since  the  current  orientation  of  the  observer  defines  the  observer's  body 
coordination  system.  Consequently,  the  selection  of  the  observer-fixed  coordinate  system  reduces 
the  dynamics  equations  to  (3.5-7b). 

Although  the  dynamics  equation  (3.5-7b)  completely  define  the  rotational  motion,  it  is  not  a 
system  model  since  the  nine-element  rotation  matrix  RPY (01;.)  is  uniquely  specified  by  the  three- 
variable  rotation  angles 

0k  =  (3.5-8) 

Let  RPY  '  be  the  inverse  operator  of  RPY.  That  is,  RPY-'  takes  a  rotation  matrix  as  the 
operand  and  returns  its  Euler  angles.  Applying  RPY-'  on  both  sides  of  (3.5-7b)  and  canceling  the 
constant  At,  we  have  the  rotation  system  model,  given  by 

=  (3.5-9a) 

where  zero-mean  Gaussian  random  variables  accounts  for  the  unmodeled  dynamics.  We 
specify  the  rotation  measurement  model  as 

Ac  =  ^c  +  Ck  (3.5-9b) 

where  zero-mean  Gaussian  random  variables  account  for  the  observation  errors. 

Using  the  snapshot  angular  velocity  and  heading  angle  estimates  as  the  measurements,  the 
translational  and  rotational  system  models  of  (3.5-5a)  and  (3.5-9a),  and  measurement  models  o  of 
(3.5-5b)  and  (3.5-5c),  we  can  then  directly  use  the  square  root  Kalman  filter  algorithm  presented 
earlier  in  section  2.2.3  to  generate  the  smoothed  motion  state  estimates. 


3.5.2  Terrain  Impact  Time  Map  Kalman  Filter 

The  impact  time  map  Kalman  filter  generates  the  smoothed  impact  time  estimates  using  the 
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snapshot  estimates  of  the  impact  time  as  measurements.  We  adopt  a  pixel-based  (iconic) 
approach  to  the  Kalman  filter  implementation,  where  the  impact  time  at  a  pixel  is  defined  as  the 
state  variable. 

Remember  that  we  have  chosen  the  body  coordinate  system  for  the  system  dynamics 
description.  In  this  coordinate  system,  assuming  a  rigid  terrain  surface,  the  system  model  for  the 
ith  pixel  impact  time  (state  variable)  can  be  expressed  as 

'Ci.k  =  ^i,k-i+ek-i  (3.5-10) 

where  both  Xj  and  Xi^k_i  are  represented  in  the  observer-fixed  coordinate  system  at  time 
(frame)  k,  and  £i^_i  is  a  zero-mean  Gaussian  noise  accounting  for  the  model  error.  The 
measurement  model  is  given  by 

\k  =  \k  +  ^k  (3.5-11) 

where  represents  the  measurement  error. 

The  snapshot  estimator  provides  and  its  variance  CTj  Given  the  system  and 

measurement  models,  the  implementation  of  the  Kalman  filter  for  impact  time  smoothing  is 
straight  forward  except  the  computation  of  estimate  in  the  prediction  phase  according  to  (2.2- 
17).  The  available  estimate  j  is  made  in  the  body  coordinate  system  at  time  k-1  while  the 
computation  of  needs  the  estimate  made  in  the  body  coordinate  system  at  time  k. 

We  develop  a  numerical  extrapolation  approach  to  generating  the  needed  estimate  of  xj"]^  at 
time  k,  using  the  available  impact  time  map  estimates  {Xj  j )  at  k-1  and  the  known  observer 
motion  states  ]  and  fly, k-  i-  Figure  3.5-2  illustrates  the  impact  time  evolution  between  two 
consecutive  frames.  The  evolution  is  determined  by  two  key  factors:  relative  motion  of  the 
observer  with  respect  to  the  terrain  and  a  finite  field-of-view  (FOV)  limiting  the  observer's 
instantaneous  view  of  the  terrain.  Specifically,  the  relative  motion  implies  that  will  change 

from  one  image  frame  to  another  since  it  is  measured  relative  to  the  observer.  The  limited  FOV 
implies  that  some  new  imagery  (thus  its  impact  times)  will  enter  the  impact  time  map  and  some 
old  images  will  move  out  of  the  map. 
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Time  k 


At  time  k-1,  the  estimated  terrain  point  i's  impact  time  vector  is  given  by 
^i,  k- 1  “  (^iz,  k- ^iz,  k- l^i’ k- 1^ 

from  the  perspective  projection.  At  time  k,  the  observer  and  his  associated  coordinate  system 
move  to  a  new  location.  In  the  observer-fixed  coordinate  system,  the  orientation  change  due  to 
rotations  can  be  determined  from  equation  (3.5-7a)  and  is  given  by 

Tk_  1  =  [RPY(i\_  lAt)]'^  (3.5-13) 


The  impact  time  change  due  to  the  translational  motion  can  be  determined  from  equation 
(3.5-la)  by  normalizing  the  equation  using  the  observer  speed  V^.i 


i^k-i  =  -flv,k-i^t 


(3.5-14) 


according  to  equation  (3.5-2b). 

Given  the  estimated  impact  time  vector  i ,  the  angular  velocity  estimate  j  ,  and  the 
heading  vector  estimate  Uy,  k  -  h  ^he  impact  time  vector  at  time  k  can  be  estimated  as 

ti,=  [RPY(A,_iAt)]Xk-i  -fiv,k-iAt]  (3-5-15) 

where  the  superscript  i  denotes  that  is  computed  using  the  impact  time  vector  ^  - 1  of  pixel  i 
at  time  k-1.  Note  that  because  of  the  observer  motion  and  the  limited  horizon,  the  estimated 
impact  time  vector  t{.  would  not  project  to  pixel  i  at  k,  but  to  new  image  coordinates  (x|;^,  ylc)^, 

determined  by 
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(x|c,  ylc)^  =  for  i  =  1,  2,  N 


(3.5-16) 


As  shown  in  figure  3.5-3,  the  closest  pixel  row  index  rh  and  column  index  n  for  the  image 
coordinates  (xj.,  y[.)^  can  be  computed,  respectively,  by 


rh  =  int[{ 


xj^Nkf/ 
tan  /  2) 


+  N4,-i-  l)/2] 


"'  =  "'«rS72)"N«-^l)/21 


(3.5-17) 

(3.5-18) 


where  int(d)  is  an  integer  operator  which  returns  an  integer  that  is  closest  to  the  value  d,  0  is  the 
vertical  FOV  angle,  Nq  is  the  number  of  the  vertical  pixels,  4^  is  the  horizontal  FOV,  angle,  and 
Nvj/  is  the  number  of  horizontal  pixels. 


1  m  N,p 


As  shown  in  figure  3.5-3,  an  image  point  (xJ,,  y^)^  either  stays  within  the  image  FOV  at  k  if  1 
<  th  <  N4/  and  1  <  n  <  N©,  or  moves  out  if  one  of  the  inequality  constraints  does  not  hold.  In  the 
first  case,  we  consider  pixel  j  =  (n  -  l)Nvj/  -H  m  at  k  has  an  impact  time  tj.  and  a  variance  P“i^, 

computed  in  the  prediction  phase  of  the  Kalman  filter  by 

^j:k  =  ^k  (3.5-19a) 

Pj:k  =  Pi,k/«  (3.5- 19b) 

where  a  <  1  is  a  discount  factor  accounting  for  the  uncertainty  in  impact  time  estimation  via  the 
numerical  extrapolation. 
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4.  System  demonstration  and  evaluation 

This  chapter  demonstrates  and  evaluates  overall  operation  of  the  ATAS.  We  show  that  the 
AT  AS  system  can  accurately  generate  navigation  critical  observer  motion  state  and  terrain  shape 
information,  such  as  observer  heading,  observer  angular  velocity,  and  the  observer  impact  time 
(depth)  map  across  all  pixels.  Section  4.1  describes  the  image  generation  and  defines  the 
performance  evaluation  criterion.  Section  4.2  presents  evaluation  results  using  flat  terrain  CGI, 
section  4.3  presents  the  evaluation  results  using  non-flat  terrain,  and  section  4.4  presents  the 
evaluation  results  using  flight-recorded  and  real-time  frame-grabbed  terrain  imagery.  Section  4.5 
concludes  the  chapter  with  a  summary  of  the  evaluation  results. 

4.1.  Image  Generation  and  Performance  Criterion 

Figure  4.1-1  illustrates  the  overall  architecture  of  the  image  generator,  which  converts 
computer  generated  imagery  (CGI)  or  real  imagery  generated  by  a  (digital  or  video)  camera  into 
a  sequence  of  digitized  (gray  level)  image  frames.  As  shown  in  figure,  images,  when  generated  in 
analog  format  either  via  a  computer  or  a  video  camera,  must  first  pass  through  a  frame  grabber  to 
be  grabbed  into  digitized  image  frames. 


Figure  4.1-1:  Image  Generator 


The  image  generator  has  two  operating  modes:  real-time  and  non-real  time,  as  shown  in 
figure  4.1-1  via  the  solid  and  dashed  lines,  respectively.  In  the  real-time  mode,  the  image 
generator  sends  the  digitized  images  directly  to  the  estimator  at  a  fixed  frame  rate  for  real-time 
estimation.  Consequently,  in  the  real-time  mode,  the  estimator  must  be  able  to  generate  each 
estimate  before  the  next  frame  arrives.  In  the  non-real  time  mode,  the  image  generator  stores  the 
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digitized  images  as  image  files  and  the  estimator  then  reads  the  digitized  image  frames  from 
image  files,  frame  by  frame.  The  estimator  can  thus  take  time  to  finish  an  estimate  before  it  reads 
the  next  image  frame  from  the  image  files. 

When  real  imagery  is  generated  using  a  digital  camera  or  when  CGI  is  generated  in  digital 
format  in  real-time,  real-time  operation  of  ATAS  then  depends  only  on  the  estimator's  host 
processor  being  fast  enough  to  finish  each  estimate  before  the  next  image  arrives.  When  real  or 
CGI  imagery  is  generated  in  analog  format,  whether  the  system  can  operate  in  real  or  non-real 
time  mode  then  also  depends  on  whether  or  not  the  frame  grabber  is  capable  of  real-time  image 
grabbing. 

The  image  generator  provides  both  the  CGI  and  real  imagery  for  on-line  motion  and  terrain 
shape  estimation.  For  CGI  generation,  we  developed  and  implemented  a  flat-terrain  image 
generator  that  produces  CGI  frames  simulating  perspective  images  viewed  by  an  observer 
(camera)  flying  over  a  flat  terrain  decorated  by  arbitrary  sinusoidal  patterns.  We  also  developed 
and  implemented  a  rolling-terrain  image  generator  that  produces  video  (analog)  imagery 
simulating  images  viewed  by  an  observer  flying  over  a  sinusoidal  rolling  terrain  decorated  by 
sinusoidal  patterns.  In  addition,  from  various  sources  we  obtained  other  complex  CGI  terrain 
images  used  widely  for  computer  vision  algorithm  testing.  For  real  image  generation,  we 
obtained  video  imagery  using  a  video  camera  fixed  on  a  helicopter  flying  over  real  terrain.  To 
convert  the  analog  video  imagery  into  the  digitized  image  frames,  we  implemented  a  frame 
grabber  using  hardware  and  software  frame  grabbing  tools  hosted  on  a  Macintosh  Quadra  840 
AV  computer. 

Section  4.1.1  briefly  describes  the  CGI  generation  process,  section  4.1.2  the  real  image 
generation  process,  and  section  4.1.3  the  image  grabbing  process.  Section  4.1.4  describes  the 
performance  evaluation  criterion. 

4.1.1  CGI  Generation 

Figure  4.1-2  is  an  overview  block  diagram  of  the  flat  terrain  image  generator,  which  can 
either  produce  image  data  files  to  support  the  non-real  time  evaluation  of  the  estimator,  or 
generate  digitized  image  frames  at  a  fixed  frame  rate  to  support  real-time  evaluation,  if  the  image 
generator  and  estimator's  host  processor(s)  has  sufficient  processing  power. 
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terrain  texture 
parameters 


parameters 


Figure  4.1-2:  Flat  Terrain  Image  Generator 

As  shown  in  the  figure,  three  modules  drive  the  flat-terrain  computer  image  generation: 
Terrain,  Observer,  and  Sensor.  The  terrain  module  defines  the  texture  pattern  on  the  ground 
plane,  to  be  imaged  by  the  observer  (sensor).  It  supports  fairly  complex  pattern  generation,  in 
which  pattern  intensity  is  a  two-dimensional  sum-of-sinusoids,  of  variable  spatial  frequency, 
intensity  amplitude,  and  spatial  phase  shift.  This  capability  supports  the  generation  of  texture 
patterns  which  span  the  range  of  appearance  from  perfectly  regular  to  totally  random. 

The  Observer  module  simulates  the  kinematics  involved  in  the  observer’s  flying  over  the 
ground  terrain.  The  module  generates  the  terrain-relative  observer  states  of  position  (R),  velocity 
(V),  attitude  (A),  and  angular  velocity  (O),  and  updates  them  every  simulation  time  step. 

The  Sensor  module  simulates  the  basic  geometric  attributes  of  the  imaging  observer's  sensor 
geometry  configuration.  It  specifies  the  sensor  boresight  in  the  vehicle  body-axis  coordinates 
(us),  as  well  as  the  size  of  the  FOV  (T^fov.  ©Fov)  ^nd  the  lateral  and  vertical  pixel  count  (Nxp,  N©). 

This  module  supports  arbitrary  boresighting  and  orientation  of  the  sensor  FOV,  for  general 
configurations. 

The  terrain  texture  parameters,  observer  states,  and  sensor  parameters  are  then  combined  by  a 
computer  image  generator  module;  a  computer  algorithm  that  generates  a  sequence  of  image 
frames  in  true  perspective,  as  seen  by  the  observer  in  flight  over  simulated  terrain.  Each  image 
frame  is  defined  by  a  set  of  N  intensity  scalars  {I;},  one  for  each  of  the  N  sensor  pixels.  One 
image  frame  is  produced  every  simulation  time  tick.  The  computation  algorithm  of  the  flat 
terrain  image  generation  is  implemented  in  the  C  programming  language. 

Figures  4.1-3  to  4.1-5  illustrate  a  group  of  image  frames  generated  by  the  flat  terrain  image 
generator.  Figure  4.1-3a  to  4.1-3d  show  images  with  a  single  sinusoidal  of  500  ft  wavelength 
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decoration  in  both  x  and  y  directions,  a  field-of-view  (FOV)  of  32°  x  32°,  and  an  observer  flying 
straight-and-level  at  an  altitude  of  200  ft  and  a  speed  of  300  ft/second  with  different  viewing 
depression  angles.  The  images  at  the  left  are  generated  at  a  high  resolution  of  128  x  128  pixels 
per  image  frame,  and  the  images  at  the  right  are  generated  at  a  low  resolution  of  32  x  32  pixels 
per  image  frame. 

As  shown  in  figure  4.  i-3a,  at  zero  boresight  depression  angle  and  with  a  resolution  of  128  x 
128  pixels,  the  generated  image  approximates  the  true  unpixellated  terrain  image  viewed  through 
the  sensor  FOV.  The  upper  half  is  sky,  and  has  a  neutral  grey  value.  The  lower  half  shows  the 
terrain  pattern,  which  naturally  recedes  to  the  horizon  in  the  2D  sinusoidal  pattern  used  to 
"decorate"  the  terrain.  With  a  low  resolution  of  32  x  32  pixels,  the  effects  of  pixellation  on  the 
image  are  clearly  evident,  as  shown  in  figure  2.2b. 

As  shown  in  figure  4.1 -3c,  at  a  larger  boresight  depression  angle  of  45°,  the  horizon  is  no 
longer  visible.  A  "flatter"  2D  sinusoidal  pattern  appears  in  the  image.  Similar  pixellation  effects 
are  also  clearly  evident. 


Figure  4,l-3a:  High 


Image 
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Figure  4.1-3c:  High  Resolution  Image  at  30°  Figure  4.1-3d;  Low  Resolution  Image  at  30° 
Depression  Angle  Depression  Angle 

Figures  4.1-4a  and  4.1-4b  show  45°  depression  angle  images  at  an  FOV  of  64°  x  64°,  at  high 
and  low  resolution,  respectively.  Comparing  with  images  4.1-3c  and  4.1-3d,  we  see  the  effects  of 
FOV  on  viewing  more  decorated  2D  sinusoidal  patterns. 


Figure  4.1-4a:  High  Resolution  Image  at  40°  Figure  4.1-4b:  Low  Resolution  Image  at  30° 
Depression  Angle  &  65°  FOV  Depression  Angle  &  65°  FOV 

Figures  4.1 -5a  and  4.1 -5b  show  45°  depression  angle  images,  at  a  shorter  wavelength  of  50 
feet,  at  high  and  low  resolutions,  respectively.  Comparing  with  images  4.1-3c  and  4.1-3d,  we  see 
clearly  the  effects  of  wavelength  on  the  decorated  2D  sinusoidal  patterns  of  the  viewed  image. 
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Figure  4.1-5a:  High  Resolution  Image  at  45°  Figuifc  4.1-5b:  Low  Resolution  Image  at  45 
Depression  Angle  &  50  ft  Wavelength  Depression  Angle  &  50  ft  Wavelength 


Figure  4.1-6  illustrates  an  overview  block  diagram  of  the  rolling  terrain  image  generator 
developed  on  a  Silicon  Graphics  (SG)  computer  (SG-Indigo  R4000  Elan).  Similar  to  the  flat 
terrain  image  generator,  the  rolling  terrain  image  generator  employs  a  terrain  module  to  define 
the  terrain  texture  pattern,  an  observer  module  to  simulate  observer  kinematics  involved  in  the 
observer’s  flying  over  the  ground  terrain,  and  a  sensor  module  to  specify  the  sensor  geometry 
configuration.  The  terrain  module  also  defines  rolling  terrain  topography  approximated  using 
sinusoidal  functions.  The  rolling  terrain  image  generator  does  not  directly  produce  digital  image 
frames,  but  instead  uses  built-in  graphics  engine  (implemented  in  hardware)  to  display  the 
perspective  view  of  textured  rolling  terrain,  that  is  subsequently  recorded  on  videotape. 


terrain  texture 
parameters 


parameters 


Figure  4.1-6:  Rolling  Terrain  Image  Generator 
Figure  4.1-7  shows  a  frame  of  a  rolling  terrain  image  sequence  that  was  first  generated  by  the 
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rolling  terrain  image  generator  and  then  grabbed  into  a  single  image  frame  by  the  frame  grabber. 
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camera  fixed  on  a  helicopter  flying  over  an  airfield.  The  helicopter  flies  straight  and  level  at  a 
speed  around  32  feet/sec  and  an  altitude  16  feet  above  the  ground.  The  camera  boresight  angle  is 
-8° ,  and  the  lateral  and  vertical  FOV  are  42°  and  44°,  respectively.  The  image  sequences  have  40 
image  frames,  each  having  a  size  of  512  x  477  pixels.  The  imagery  is  kept  in  two  formats:  image 
frame  data  files  and  video  tapes.  Figure  4.1-9  shows  a  frame  from  the  sequence. 


Figure  4.1-9:  An  Image  Frame  from  the  NASA  Image  Sequences 


4.1.3  Frame  Grabber 

Figure  4.1-10  illustrates  a  real-time  implementation  of  the  image  generator.  Without  a  real¬ 
time  digital  camera  system,  our  effort  concentrated  on  real-time  frame  grabbing  of  video  images. 
As  shown  in  the  figure,  an  image  video  tape  was  first  generated  either  using  a  camcorder  to  take 
pictures  in  flight  or  using  the  rolling  terrain  image  generator  described  in  section  3.1.  The  tape 
could  then  be  played  back  to  generate  the  video  images  in  the  ground-based  AT  AS  environment. 
Finally,  a  frame  grabber  was  implemented  for  real-time  frame  grabbing. 


Figure  4.1-10:  Real-Time  Image  Generator  Implementation 


We  implemented  the  frame  grabber  on  a  Macintosh  Quadra  840  AV.  We  chose  the 
Macintosh  Quadra  840  AV  computer  because  it  has  a  fast  40  MHz  microprocessor,  38  MB 


48 


R92191 


Charles  River  Analytics 


RAM,  built-in  digital  audio  and  video  (DAV)  hardware,  and  an  image  compression  board—  the 
SpigotPower  AV  that  supports  real-time  (30  Hz)  frame  grabbing.  Using  the  frame  grabbing 
software  tool  Video  Fusion  1.5.1,  the  AV  computer  accepts  image  signals  from  a  video  cassette 
recorder  (VCR),  digitizes  them,  and  then  outputs  the  digitized  image  frames  at  a  user-specified 
sampling  rate.  For  non-real  time  operation,  we  first  captured  the  grabbed  image  frames  in 
computer  RAM  and  then  stored  them  into  image  files.  For  real-time  operation,  the  grabbed  image 
frame  is  sent  directly  to  the  estimator  running  on  the  same  or  another  computer. 

Figures  4.1-1  la  shows  a  frame  of  the  grabbed  image.  Compared  with  the  ideally  generated 
image  frame  shown  in  4. 1-1  lb,  deterioration  of  image  quality  during  the  image  grabbing  process 
is  evident. 


Figure  4.1-lla;  Grabbed  Image  Figure  4.1-llb:  Ideally  Generated  Image 

The  major  problem  with  real-time  frame  grabbing,  however,  is  not  that  the  images  cannot  be 
grabbed  at  high  quality,  but  that  the  frame  grabber  does  not  support  flicker-free  frame  grabbing. 
Two  problems  persist.  The  first  is  that  the  time  (rate)  needed  for  grabbing  an  image  depends  on 
the  contents  of  the  image.  Consequently,  each  image  is  grabbed  at  a  different  rate  since  all 
images  are  different.  In  addition,  the  frame  grabber  does  not  provide  information  on  the 
individual  frame  grabbing  rate.  Without  a  constant  grabbing  rate  or  knowledge  of  the  individual 
grabbing  rate,  it  is  impossible  for  the  egomotion  and  terrain  shape  estimator  to  properly  integrate 
estimates  from  multiple  images  in  the  Kalman  filtering  process. 

The  second  problem,  the  more  severe  one,  is  that  the  frame  grabber  has  an  unwanted 
adaptation  function  that  is  built  into  its  frame  grabbing  process.  If  the  frame  grabber  does  not 
complete  a  single  frame  grab  within  the  user  specified  frame  time,  it  either  automatically  drops 
the  incoming  image  frame,  outputting  the  result  from  the  previous  frame  as  the  current  one,  or 
grabs  a  partial  image  from  the  previous  frame,  continuing  from  where  it  stops  on  the  current  one. 
This  adaptation  function  adds  considerable  noise  to  the  grabbed  image  sequence. 

It  should  be  note,  however,  that  neither  problem  occured  with  the  frame  grabbed  image 
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sequence  obtained  by  NASA  under  real-time  recording  conditions.  Thus,  for  ATAI  testing,  the 
images  would  be  considered  as  eithe  generated  in  real-time  (via  direct  sensor  feed  to  the 
estimator)  or  as  recorded  and  subsequently  frame-grabbed  in  real-time. 

4.1.4  Performance  Criterion 

To  evaluate  system  performance  for  both  the  computer  generated  and  real  images,  several 
performance  indices  are  defined.  These  include: 

®  Aimpoint  Error:  This  is  obtained  from  the  angular  difference  between  true  heading  Uy  and 
heading  estimated  by  AT  AS.  The  aimpoint  error  reflects  total  aim  estimation  error  in 

both  vertical  and  lateral  directions,  and  is  computed  by 

Eaitn  =  cos- '  K  ®  U^)  (4.1-1) 

°  Angular  Rate  Error:  This  is  obtained  from  the  magnitude  of  the  difference  between  actual 
angular  velocity  Q.  and  angular  velocity  O  estimated  by  the  system.  The  angular  rate  error 
reflects  total  body-rate  error  across  all  three  body  axes,  and  is  computed  by 

P  —  1  Ck _ I 

targ  P'S  (4.1-2) 

®  Impact  Time  Error:  This  is  obtained  by  first  determining  the  difference  between  the  true 
pixel-dependent  terrain- surface  impact  time  with  the  corresponding  estimate  generated 
by  the  AT  AS  system,  at  each  visible  ground  pixel,  and  then  averaging  the  results.  Note  that 
when  the  impact  time  is  computed  in  the  navigation  system,  or  in  the  body  coordinate 
system  of  an  observer  undergoing  straight  and  level  flight,  the  average  impact  time  equals 
the  observer's  altitude  normalized  by  his  speed.  Assuming  that  the  system  has  access  to  an 
accurate  estimate  of  the  observer  speed  v,  this  error  index  reflects  the  precision  of  the 
system  in  estimating  the  effective  altitude  (range)  of  the  observer  above  the  terrain,  in  time 
units. 

®  Across-Frame  Root-Mean  Square  (RMS)  Error:  This  is  obtained  by  first  computing  the 
error  mean  |i,  and  stand  deviation  s  across  multiple  frames.  The  RMS  error  is  then 
computed  via 

RMS  =  +  s2  (4.1-3) 

4.2  Evaluation  Using  Flat  Terrain  CGI 

Two  kinds  of  evaluations  using  the  flat  terrain  CGI  were  performed  to  examine  the 
performance  of  ATAS:  the  evaluation  of  system  estimation  accuracy  and  the  evaluation  of  the 
system  for  modeling  the  human  flow-based  egomotion  process.  These  evaluations  are  presented 
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in  sections  4.2.1  and  4.2.2,  respectively. 


4.2.1  Estimation  Accuracy  with  Flat  Terrain  CGI 

A  sequence  of  images  was  computer  generated  using  the  flat-terrain  image  generator 
illustrated  in  figure  3.2-1.  The  simulated  vehicle  flies  straight-and-level  with  a  speed  of  300 
ft/sec  at  an  altitude  of  200  feet  above  the  terrain,  with  zero  body  attitude  rates.  The  vehicle  sensor 
was  boresighted  15°  below  the  horizon.  The  sensor  field-of-view  (FOV)  was  set  to  32°  laterally 
by  32°  vertically,  and  the  pixel  count  was  32  laterally  by  32  vertically  to  yield  a  nominal  1°  by  1° 
pixel.  The  simulated  terrain  is  flat  and  featureless  and  is  textured  via  a  spatial  2D  sinusoidal 
pattern  with  a  wavelength  of  500  feet  in  both  vertical  and  lateral  directions. 

Figure  4.2-1  shows  one  image  frame  generated  under  the  above  conditions.  With  a  resolution 
of  128  X  128  pixels,  the  figure  approximates  the  true  unpixellated  terrain  image  seen  through  the 
sensor  field-of-view.  The  upper  10%  of  the  figure  is  sky,  and  has  a  neutral  gray  value.  The  lower 
portion  below  the  horizon  shows  the  terrain  pattern,  which  naturally  recedes  to  the  horizon.  Note 
the  regular  2D  sinusoidal  pattern  used  to  "decorate"  the  surface.  Figure  4.2-2  illustrates  the  image 
actually  imaged  by  the  sensor  at  the  32  x  32  pixel  resolution.  The  effects  of  pixellation  are 
clearly  evident. 
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In  the  simulation,  a  sequence  of  consecutive  images  like  these  are  generated,  and  then 
processed,  frame  by  frame,  by  ATAS  to  obtain  the  state  and  terrain  shape  estimates.  For  the 
simulations  described  in  this  section,  the  interframe  sample  time  is  set  at  0.02  sec  (a  frame  rate  of 
50  Hz),  so  that  an  imaged  terrain  point  in  the  center  of  the  sensor  focal  plane  will  move  at  about 
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one  pixel/frame,  at  the  given  nominal  altitude,  speed,  and  boresight  angle. 

Using  the  same  image  sequences,  we  first  evaluated  the  geometry  based  egomotion  and 
terrain  shape  estimator  (section  4.2. 1.1),  and  then  the  structure  based  estimator  (section  4.2. 1.2). 

4.2. 1. 1  Geometry  Based  Estimator 

Figures  4.2-3  to  4.2-6  illustrate  the  motion  state  and  terrain  parameter  estimation  time 
histories  for  AT AS  using  the  geometry  based  estimator.  The  system  used  a  flat  terrain  model, 
and  assumed  an  initial  +5°  error  in  both  the  flight  path  and  the  heading  angles  to  reflect  the  initial 
uncertainty  on  the  observer  motion  state.  The  model  standard  deviations  for  heading  and  path 
angles  were  set  at  0.01°,  the  model  variances  for  angular  velocity  vector  were  set  at  0.01  “/sec^ 
for  each  individual  vector  element.  The  discount  factor  for  terrain  impact  time  (range)  map 
evolution  was  set  at  0.9. 

In  the  figures,  the  solid  line  represents  the  snapshot  estimates,  the  thick  solid  line  represents 
the  Kalman  filter  smoothed  estimates,  and  the  dashed  line  represents  the  true  observer  state  and 
terrain  shape  parameters. 

Figure  4.2-3  shows  a  time  history  for  the  estimates  of  the  observer  impact  time 
(altitude/speed),  which  is  the  cross-pixel  average  of  the  impact  time  map  in  the  body  coordinate 
system.  As  shown,  both  the  snapshot  and  filtered  impact  times  oscillate  around  the  actual  impact 
time.  Note  that  a  larger  variation  occurs  in  the  snapshot  estimate,  and  the  effects  of  the  Kalman 
filter  in  smoothing  the  oscillation  of  the  snapshot  estimate  and  improving  estimation  accuracy  are 
clearly  demonstrated.  Oscillation  in  the  estimate  is  due  to  the  periodic  nature  of  the  terrain 
sinusoidal  pattern  being  viewed. 


Figure  4.2-3:  Impact  Time  Estimation  History 
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Figures  4.2-4  and  4.2-5  illustrate  time  histories  of  the  estimated  heading  and  flight  path 
angles,  respectively.  As  illustrated,  the  snapshot  heading  angle  estimate  oscillates  within  a  0.5° 
range  around  the  true  heading  vector,  while  the  snapshot  flight  path  angle  estimate  oscillates 
within  a  1.5°  range  with  about  a  1.5°  bias.  The  filtered  heading  angle  converges  to  the  true 
heading  angle,  while  the  filtered  flight  path  angle  converges  with  about  a  1.5°  b  as. 


Figure  4.2-4;  Heading  Angle  Estimation  History 


'  Figure  4.2-5:  Flight  Path  Angle  Estimation  History 


Figure  4.2-6  shows  a  time  history  of  the  estimated  pitch  (y-axis)  angular  velocity.  As  shown, 
both  the  snapshot  and  filtered  estimates  oscillate  with  a  bias  of  2.5°.  Again,  the  effects  of  Kalman 
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filtering  in  smoothing  the  oscillation  of  the  snapshot  stimate  and  imrt ing  estimation  accuracy 
are  clearly  demonstrated.  The  x  and  z-axis  angular  '■  docity  estimates  are  not  shown,  since  they 
estimated  the  true  anagular  rates  (which  are  zero)  wii  errors  of  less  than  0.01°. 


Frame  Count 


Figures  4.2-7  to  4.2-9  illustrate  the  snapshot  and  filtered  estimation  error  histories  for  the 
motion  state  and  terrain  parameters.  As  shown  in  figure  4.2-7,  the  snapshot  impact  time  estimate 
has  an  oscillating  error  of  0%  to  15%,  which  is  reduced  to  within  a  5%  error  zone  after  Kalman 
filtering. 
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Figure  4=2-7:  Impact  Time  Error  History 

Figure  4.2-8  illustrates  the  snapshot  and  the  filtered  aimpoint  estimation  error  histories.  The 
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snapshot  estimation  error  oscillates  between  2°  to  4°,  while  the  filtered  eS;  nation  error 
converges  to  about  1.2°. 


Figure  4.2-8:  Aiiupoint  Error  History 


Figure  4.2-9  illustrates  the  snapshot  and  the  filtered  angular  rate  estimation  error  histories. 
The  snapshot  estimation  error  shows  an  oscillating  error  between  1.5  to  4°,  while  the  filtered 
estimation  error  shows  a  slightly  oscillating  3°  error. 


Figure  4.2-9:  Angular  Rate  Error  History 


RMS  errors,  using  the  geometry  based  estimator,  in  the  filtered  impact  time,  aimpoint,  and 
angular  rate  are  3.02%,  1.09°,  and  2.77°/sec,  respectively. 
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4.2.2  Structure  Based  Estimator 

Figures  4.2-10  to  4.2-16  illustrate  the  motion  state  and  terrain  parameter  estimation  time 
histories  for  AT  AS  using  the  structure  based  estimator.  The  system  used  a  structure  of  100  layers 
with  a  0.01  equal  relative  distance  between  layers.  It  also  assumed  a  system  model  standard 
deviation  0.01°  for  heading  and  path  angles,  0.01  °/sec  for  each  element  of  the  angular  velocity 
vector.  The  discount  factor  for  terrain  impact  time  (range)  map  evolution  was  set  at  0.9. 

In  the  figures,  the  solid  line  represents  the  snapshot  estimates,  the  thick  solid  line  represents 
the  Kalman  filter  smoothed  estimates,  and  the  dashed  line  represents  the  actual  observer  state  and 
terrain  shape  parameters. 

Figure  4.2-10  shows  time  histories  of  the  estimated  impact  times.  As  shown,  both  the 
snapshot  and  Kalman  filtered  impact  times  oscillate  around  the  actual  impact  time.  Note  that  a 
larger  variation  occurs  in  the  snapshot  estimate,  and  the  effects  of  the  Kalman  filtering  in 
smoothing  the  oscillation  of  the  snapshot  estimate  and  improving  estimation  accuracy  are  clearly 
demonstrated. 


Figure  4.2-10:  Impact  Time  Estimation  History 


Figures  4.2-11  and  4.2-12  illustrate  time  histories  of  the  estimated  heading  and  flight  path 
angles,  respectively.  As  illustrated,  the  snapshot  heading  angle  estimate  oscillates  within  a  1.0° 
range  around  the  true  heading  vector,  while  the  snapshot  flight  path  angle  estimate  oscillates  in  a 
±5°  range.  The  filtered  heading  vector  converges  to  the  true  heading  angle,  while  the  filtered 
flight  path  angle  converges  with  about  a  2.5°  bias. 
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Frame  Count 


Figure  4.2-11:  Heading  Angle  Estimation  History 


Frame  Count 


Figure  4.2-12:  Flight  Path  Angle  Estimation  History 

Figure  4.2-13  shows  a  time  history  of  the  estimated  pitch  (y-axis)  angular  velocity.  As 
shown,  the  snapshot  estimate  oscillates  about  the  true  angular  velocity,  while  the  filtered  estimate 
oscillates  with  a  bias.  The  x  and  y-axis  angular  velocity  estimates  are  not  shown.  They  have  a 
small  error  of  less  than  0.01°. 
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Figure  4.2-13:  Pitch  Angular  Velocity  Estimation  History 


Figures  4.2-14  to  4.2-16  illustrate  the  filtered  estimation  error  histories  for  the  motion  state 
and  terrain  parameters.  As  shown  in  figure  4.2-14,  the  filtered  impact  time  error  oscillates  around 
a  negative  10%  bias  toward  the  smaller  impact  time. 


Figure  4.2-14:  Impact  Time  Error  History 


Figure  4.2-15  illustrates  the  filtered  aimpoint  estimation  error  history.  The  filtered  estimation 
error  oscillates  between  1°  to  2°. 
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Figure  4.2-15:  Aimpoint  Error  History 


Figure  4.2-16  illustrates  the  filtered  angular  rate  estimation  error  history.  The  filtered  angular 
velocity  estimation  has  a  very  small  error  of  less  0.2°. 


Figure  4.2-16:  Angular  Rate  Error  History 


RMS  errors,  using  the  structure  based  estimator,  in  the  filtered  impact  time,  aimpoint,  and 
angular  rate  are  17.02%,  1.52°,  and  0.13°/sec,  respectively. 

4.2.2  Comparison  ofATAS  Predictions  with  Human  Experiments 

We  now  compare  results  generated  by  AT AS  with  some  results  obtained  in  two  experiments 


59 


R92191 


Charles  River  Analytics 


conducted  by  Warren  (1976)  and  Grunwald  &  Kohn  (1993).  Detailed  explanations  of  the 
experiments  are  given  in  the  respective  references.  Section  4.2.2. 1  presents  the  comparison  with 
Warren's  experiment,  while  section  4. 2. 2.2  presents  the  comparison  with  Grunwald  and  )hn’s 
experiment. 

4.2.2. 1  Comparison  with  Warren's  Experiment 

In  the  experiment  conducted  by  Warren  (1976),  subjects  were  "flown"  over  a  flat  terrain 
decorated  by  a  uniformly  random  array  of  luminous  dots.  The  flight  path  was  straight  and  level, 
with  an  altitude  of  d  units  and  speed  1.25d  units/s,  where  d  was  the  average  inter-dot  spacing  of 
the  terrain  texture.  The  lateral  FOV  was  53.1  deg,  and  the  vertical  FOV  was  26.5  deg  below  the 
displayed  horizon.  After  viewing  the  display,  the  subjects  were  asked  to  indicate  their  lateral 
heading  aim  point,  and  were  able  to  do  so  with  an  accuracy  of  1.5  deg  standard  deviation  (across 
subjects). 

To  simulate  the  experimental  environment,  a  sequence  of  computer  images  were  generated 
by  using  the  AT AS  flat  terrain  image  generator  described  in  section  3.1  The  simulated  vehicle 
flew  straight-and-level  at  an  altitude  of  100  feet  above  the  terrain,  with  zero  body  attitude  rates. 
The  frame  time  between  images  was  set  as  0.033  s  (30  Hz)  for  consistency  with  our  analysis  of 
the  Grunwald  &  Kohn  (1993)  experiment  (see  below).  The  terrain  scene  was  approximated  by  a 
two-dimensional  sum-of-sines  spatial  texture  pattern,  having  a  fundamental  wavelength  of  100  ft, 
with  coefficients  chosen  to  approximate  the  luminous  dot  pattern  used  in  the  original  experiment. 
The  image  plane  was  set  normal  to  the  observer’s  velocity  vector  with  a  zero  depression  angle 
relative  to  the  terrain.  The  sensor  FOV  was  set  to  53. 1°  laterally  by  53.1°  vertically,  and  the  pixel 
count  was  set  at  100  laterally  by  100  vertically,  for  a  total  of  10,000  pixels. 

Figure  4.2-17  shows  one  image  frame  generated  under  the  above  conditions.  The  upper  half 
of  the  figure  is  sky,  and  has  a  neutral  gray  level.  The  lower  portion  below  the  horizon  shows  the 
array  of  luminous  dots,  which  naturally  recedes  to  the  horizon.  The  effects  of  pixellation  are 
clearly  evident. 

In  the  simulation,  a  sequence  of  these  images  are  generated,  and  then  processed,  frame  by 
frame,  using  the  previously  described  algorithm  to  obtain  estimates  of  the  observer  state  and 
terrain  parameters.  Figure  4.2-18  illustrates  a  typical  resulting  time  history  of  the  heading  error, 
computed  as  the  difference  between  the  actual  and  estimated  heading  angle.  The  thin  line  shows 
estimation  error  generated  by  the  snapshot  least  squares  estimator.  The  high  frequency  content  of 
this  signal  directly  reflects  the  frame-by-frame  variation  in  the  image.  The  thick  line  shows 
estimation  error  generated  by  the  Kalman  filter,  which  provides  smoothing  by  incorporating  the 
observer  dynamics.  As  can  be  seen,  the  algorithm  provides  excellent  estimation  of  the  observer’s 
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heading  by  effectively  exploiting  the  available  temporal  and  spatial  visual  information. 


-  Snapshot  Bti  mat  or 


Frame  Count 


The  level  of  performance  clearly  exceeds  that  achieved  by  human  observers  but  this  is  to  be 
expected  given  the  essentially  error-free  optimization  implemented  by  the  model.  To  better 
match  human  empirical  data,  we  assume  that  the  human  visual  system  is  not  noise  free,  but  can 
be  modeled  as  a  noisy  generator  of  the  ideal  image.  In  our  simulation,  we  implemented  this  by 
adding  zero  mean  Gaussian  random  noise  directly  to  the  (pixellated)  image,  choosing  the 
standard  deviation  (SD)  of  the  image  noise  level  to  be  fixed  percentage  of  the  overall  gray  level 
of  the  image.  Figure  4.2-19  illustrates  the  resulting  heading  error  estimation  time  history  with  a 
10%  image  noise  level  (noise  SD  equals  10%  of  the  overall  image  intensities).  Clearly,  this 
injection  of  image  noise  significantly  increases  the  magnitude  of  the  snapshot  or  least  squares 
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estimate  (note  scale  differences  with  figure  4.2-18),  as  well  as  that  of  the  Kalman  filter  estimate. 


Snapshot  Estimator 
****^'^'  ''^  Kalman  Filter 


Figure  4.2-19:  Heading  Estimation  Error  Time  History  (10%  Image  Noise) 

With  this  facility  for  injection  of  image  noise,  we  then  conducted  a  number  of  Monte  Carlo 
simulations  to  generate  statistics  of  the  resulting  estimation  error,  for  different  image  noise 
levels.  For  each  run,  we  computed  the  root  mean  squared  (RMS)  estimation  error  over  a  response 
time  window,  starting  from  1.5  sec  following  stimulus  presentation  (assuming  subjects  needed  at 
least  1.5  sec  to  respond  and  running  to  4.0  sec  following  stimulus  presentation  (when  model 
response  suggests  no  further  reduction  in  the  estimation  error).  We  then  computed  the  mean  and 
standard  deviation  of  the  RMS  estimation  error  across  30  Monte  Carlo  simulations  of  the  same 
experimental  condition.  This  process  was  then  replicated  for  several  different  image  noise  levels. 

Figure  4.2-20  shows  mean  (triangles)  and  standard  deviation  (error  bars)  of  the  RMS 
estimation  error  as  predicted  by  the  model  for  three  different  image  noise  levels:  5%,  10%,  and 
20%  (the  curve  is  an  interpolation  between  conditions).  We  see  monotonically  increasing  error 
with  image  noise  level.  We  have  also  indicated  on  the  figure  the  1 .5  deg  estimation  error 
obtained  under  the  experiment  conducted  by  Warren  (1976),  and  have  indicated  how  this 
performance  level  can  be  “explained”  as  an  8%  perceptual  image  noise  level,  within  the  context 
of  the  model  we  have  presented  here. 
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Noise  Level  (%) 

Figure  4.2-20:  Estimation  Error  vs.  Perceptual  Noise  Level 

4.2.2.2  Comparison  with  Grunwald  and  Kohn’s  Experiment 

In  one  portion  of  the  experiment  conducted  by  Grunwald  and  Kohn,  a  subject  was  "flown" 
over  flat  terrain  decorated  by  a  field  of  rectangular  patches  placed  in  a  cross-grid  pattern.  As  with 
Warren’s  experiment,  the  flight  path  was  straight  and  level  and  the  image  plane  was  normal  to 
the  observer’s  velocity  vector.  The  field  of  view  was  102.7  deg  horizontal  and  90  deg  vertical. 
Observer  altitude  was  set  at  d  units  above  the  ground  plane.  The  terrain  was  visible  from  a 
distance  Id  until  15d  units  from  the  observer.  The  average  length  and  width  of  the  viewed 
rectangles  was  0.385d,  and  the  average  distance  between  their  centers  was  0.625d.  In  each  trial, 
the  lateral  flight  path  deviated  from  the  viewing  axis  by  the  sideslip  angle  (3,  which  was  chosen 
from  a  uniformly  distributed  random  set  ranging  from  -45  to  +45  deg.  The  subject's  task  was  to 
place  a  marker  on  the  estimated  aimpoint.  The  subject  was  given  a  maximum  of  8  sec  to  respond, 
after  which  the  run  was  terminated.  In  the  experiment,  different  velocity-to-height  ratios  were 
used,  ranging  from  0.25  to  4s'^. 

In  our  simulation  of  this  experiment,  the  simulated  vehicle  also  flies  straight-and-level  at  an 
altitude  of  100  feet  above  the  terrain,  with  zero  body  attitude  rates.  The  frame  time  between 
images  was  set  as  0.033  s  (30  Hz),  which  is  the  screen  update  frequency  of  the  Silicon  Graphics 
IRIS  4D  50/GT  workstation  used  in  the  experiment.  The  terrain  scene  was  approximated  by  a 
two-dimensional  sum-of-sines  texture  pattern,  of  regularly  distributed  square  patches  of  size  38.5 
ft,  spaced  77.0  ft  apart.  The  image  plane  was  set  normal  to  the  observer’s  velocity  vector.  The 
sensor  FOV  was  set  to  90°  laterally  by  90°  vertically,  and  the  pixel  count  was  set  at  100  laterally 

by  100  vertically,  for  a  total  of  10,000  pixels. 

Figure  4.2-21  shows  one  image  frame  generated  under  the  above  conditions.  Figure  4.2-22 
illustrates  a  typical  time  history  of  the  heading  error,  obtained  during  our  Monte  Carlo  analysis 
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of  this  experiment,  using  a  15%  image  noise  level  to  simulate  hum  r  perceptual  limitations.  As 
before,  the  thin  line  shows  the  output  of  the  snapshot  least  square  ,  estimator  (before  filtering) 
and  the  thick  line  shows  the  output  of  the  Kalman  filter. 


Figure  4.2-21:  CGI  used  for  Simulating  Grunwald  and  Kohn's  Experiment 

-  Snapshot  Estimator 


Kalman  Filter 


Figure  4.2-22:  Heading  Estimation  Error  Time  History  (15%  Image  Noise) 

As  with  our  analysis  of  the  experiment  by  Warren,  we  varied  the  image  noise  level  to  find  a 
best  match  to  the  experimental  data,  across  all  conditions  tested.  For  this  experiment,  we  found  a 
15%  image  noise  level  to  provide  the  best  overall  model  match  to  the  data.  Figure  4.2-23  shows 
the  resulting  RMS  estimation  error  in  aimpoint,  as  a  function  of  velocity-to-height  ratio  (the 
"global  flow  rate"  of  Warren).  Model  predictions  are  indicated  by  the  shaded  region,  defined  as 
the  plus/minus  one-sigma  envelope  about  the  expected  RMS  error.  Experimental  data  points  are 
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indicated  by  the  solid  circles.  Note  f'nat  the  data  shows  a  clear  trend  in  which  errors  vary 
inversely  with  velocity-to-height  ratio,  with  the  lower  flow  rates  yielding  higher  errors.  This 
trend  is  replicated  by  the  model  results,  as  we  expect  with  extreme  (in  this  case,  very  low)  flow 
rates.  Although  not  covered  in  this  experiment,  we  would  expect  a  gradual  increase  in  aimpoint 
estimation  error,  as  flow  rates  are  increased,  so  that  an  overall  U-shaped  function  is  obtained. 


Figure  4.2-23:  Heading  Estimation  Error  vs.  Velocity  Height  Ratio  (Data  vs.  Model) 

Using  AT  AS,  we  analyzed  several  earlier  experimental  studies  of  human  egomotion 
perception.  For  the  two  reported  on  here,  we  used  simple  additive  image  noise  to  model  the 
human's  inability  to  register  perfectly  noise-free  images,  and  we  were  able  to  show  how  this 
“perceptual  noise”  leads  to  corresponding  inaccuracies  in  estimates  of  egomotion  state.  In 
particular,  we  showed  how,  within  the  model  context,  an  assumed  10%  to  15%  image  noise  level 
leads  directly  to  model-predicted  egomotion  aimpoint  accuracy  in  the  experimentally  observed 
range  of  1.5  to  2.5  deg.  In  addition,  we  showed  how  these  aimpoint  accuracy  can  be  expected  to 
co-vary  with  velocity-to-height  ratios,  which  change  by  over  a  factor  of  10,  and  confirmed  these 
trends  with  experimental  data  showing  the  same  trends.  Although  more  model  validation  is 
clearly  needed,  we  feel  that  these  initial  matches  between  model  and  data  are  very  encouraging. 

4.3  Evaluation  Using  Non-Flat  Terrain  CGI 

We  also  tested  the  egomotion  and  terrain  shape  estimator  on  complicated  non-flat  terrain 
images,  specifically,  the  Yosemite  sequence,  which  was  created  by  Lynn  Quam  and  provided  to 
us  by  Dr.  J.  L.  Barron  (1994).  Figure  4.3-1  shows  one  frame  of  the  15  frame  Yosemite  sequence. 
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Figure  4.3-1:  Single  Frame  of  Yosemite  Image  Sequence 

This  sequence  is  challenging  because  of  its  complicated  terrain  structure,  which  is  beyond  the 
capability  of  any  simple  polynomial  geometry  model  (flat,  quadratic,  etc.).  As  a  matter  of  fact, 
when  we  tried  to  use  the  geometry-based  estimator  for  egomotion  and  terrain  shape  estimation, 
the  results  were  not  even  remotely  correlated  with  the  true  motion  state  and  terrain  shape. 

We  used  the  structure -based  estimator  for  egomotion  and  terrain  shape  estimation  of  the 
Yosemite  sequence.  For  the  snapshot  estimator,  a  512  equal  distance  layer  structure  was  assumed 
with  the  relative  distance  between  layers  set  as  0.01.  For  the  Kalman  filter,  the  model  standard 
deviations  for  the  heading  and  path  angles  were  set  at  0.01°,  and  the  model  standard  deviations 
for  the  angular  velocity  vector  were  set  at  0.01  °/sec  for  each  individual  vector  element.  The 
discount  factor  for  terrain  impact  time  (range)  map  evolution  was  set  at  0.9. 

Figures  4.3-2  and  4.3-3  illustrate  the  intensity-coded  third  and  ninth  depth  (impact  time) 
maps  generated  directly  from  the  estimator  without  any  post  run  image  processing  (filtering, 
sharpening,  noise  reduction,  etc.).  Lighter  pixels  in  the  map  are  closer  to  the  observer.  The  white 
pixels  are  either  pixels  that  are  above  the  horizon  or  pixels  that  have  zero  image  gradients,  thus 
have  no  information  available  for  structure  extraction. 
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Figure  43-3:  The  Ninth  Impact  Time  Map  Generated  from  Yosemite  Sequence 


Note  that  the  third  map  has  many  white  pixels,  indicating  unknown  depths  (impact  times)  at 
those  points  due  to  lack  of  gradient  information  (or  image  flow).  After  six  frames  of  Kalman 
filter  integration  of  the  depth  maps,  the  ninth  map  shows  a  much  smoother  impact  time  map  with 
fewer  white  pixels.  Although  no  quantitative  comparison  between  the  true  and  estimated  impact 
time  map  is  provided,  qualitatively,  it  is  evident  that  the  structure-based  estimator  has  a 
capability  to  recover  the  structure  of  the  terrain. 

For  egomotion  state  estimation,  the  actual  heading  vector  was  given  as  (0.0,  0.17,  0.98)  by 
Heeger  and  Jepson  (1992)  and  was  constant  over  the  15  frame  sequence.  The  estimated  heading 
vector  was  (-0.10,  0.18,  0.98)  for  the  first  estimate  (the  third  frame)  and  (-0.09,  0.18,  0.98)  for 
the  last  estimate.  Figure  4.3-4  shows  the  aimpoint  error  history  for  the  15  frames.  As  shown  in 
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the  figure,  the  average  aimpoint  error  was  less  then  5°  over  the  15  frame  sequence. 


Figure  4.3-4:  Aimpoint  Error  History 


The  actual  rotation  rate  was  given  as  0.095  degrees/frame.  Figure  4.3-5  shows  the  angular 
rate  error  history  for  the  15  frames.  The  estimated  average  rotation  rate  over  the  15  frame 
sequence  was  0.032  Vframe.  The  angular  rate  error  was  less  than  0.06°  over  the  15  frame 
sequence. 


Frame  Number 

Figure  4.3-5:  Angular  Rate  Error  History 
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4.4.  Evaluation  Using  Real  Imagery 

Evaluation  using  real  imagery  was  done  using  the  flight-recorded  and  real-time  frame 
grabbed  terrain  image  sequence  obtained  from  the  NASA  Ames  Research  Center.  The  image 
sequence  was  acquired  by  a  camera  mounted  under  a  helicopter  nose  and  oriented  roughly  in  the 
direction  of  flight.  The  position  of  the  camera  and  its  orientation  with  respect  to  the  helicopter 
were  held  constant  throughout  the  flight.  The  methodology  used  to  generate  the  image  sequence 
and  truth  data  are  described  in  Smith  (1990).  The  helicopter  was  flying  with  a  speed  of 
approximately  20  knots  at  an  altitude  of  15  ft  above  a  runway.  The  images  were  captured  at  30 
Hz,  so  that  the  helicopter  moved  approximately  1  ft  between  successive  images.  Figures  4.4- la 
and  4.4-2b  show  the  first  and  the  last  images  of  a  sequence  of  90  images.  Each  image  has  a 
resolution  of  442  x  512  pixels  at  256  gray  levels. 


Using  the  same  image  sequence,  we  first  evaluated  the  geometry  based  egomotion  and  terrain 
shape  estimator  (section  4.4.1),  and  then  the  structure  based  estimator  (section  4.4.2). 

4.4.1  Geometry-Based  Estimator 

The  geometry  model  based  estimator  was  tested  first.  For  the  snapshot  estimator,  a  flat  terrain 
model  was  adopted.  For  the  Kalman  filter,  the  model  standard  deviations  for  heading  and  flight 
path  angle  dynamics  were  set  at  0.0  ij,  amd  the  model  standard  deviations  for  angular  velocity 
dynamics  were  set  at  0.01  J/sec^’  equivalently  for  three  angular  velocity  components.  The 
discount  factor  for  terrain  impact  time  (range)  map  evolution  dynamics  was  set  as  0.9.  In 
processing  the  images,  the  pixels  above  the  horizon  were  not  used. 
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4.4. 1.1  Testing  Using  the  Original  Images 

We  first  used  the  images  at  their  original  442  x  512  pixel  resolution  without  any  pre¬ 
processing  of  images.  Figures  4.4-2  to  4.4-5  show  time  histories  for  the  estimated  helicopter 
impact  time  (altitude/speed),  heading  angle,  flight  path  angle,  and  three  individual  angular 
velocity  components.  The  plots  are  drawn  by  plotting  the  estimate  and  measured  state  and  terrain 
parameters  against  frame  count  with  a  frame  interval  time  of  1/30  sec.  In  the  figures,  the  solid 
line  represents  the  snapshot  estimates,  the  thick  solid  line  represents  the  Kalman  filter  smoothed 
estimates,  and  the  dashed  line  represents  the  helicopter  state  and  terrain  parameters  measured 
using  on-board  navigation  systems. 

As  shown  in  figure  4.4-2,  the  impact  time  estimate  has  quite  a  large  error.  Two  factors 
contribute  to  this.  The  first  is  the  lack  of  intensity  contrast  in  the  "non-decorated”  real  imagery, 
as  demonstrated  in  figure  4.4-1.  The  second  is  the  small  pixel  size  that  leads  to  a  high 
noise/signal  ratio  in  the  image  gradient  computation.  These  two  factors,  however,  seem  not  affect 
the  motion  state  estimation  accuracy  adversely.  Figures  4.4-3  and  4.4-4  illustrate  the  time 
histories  of  the  estimated  heading  and  flight  path  angles,  respectively.  The  ability  of  the  system 
to  track  the  helicopter  motion  state  dynamics  is  clearly  demonstrated  in  plots  of  the  three  angular 
velocity  components,  shown  in  figures  4.4-5,  4.4-6,  and  4.4-7,  respectively.  Note  how  closely 
and  swiftly  each  estimated  angular  velocity  component  follows  the  dynamics  of  its 
corresponding  true  angular  velocity  component. 


Figure  4.4-2:  Impact  Time  Estimation  History 
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Figure  4.4-7:  Angular  Rate  Error  History 


Figures  4.4-6  and  4.4-7  show  time  histories  of  the  aimpoint  error  and  angular  rate  error, 
respectively.  After  the  first  30  frames  (1  sec),  the  aimpoint  error  stays  below  2.5°,  and  the 
angular  rate  error  stays  below  2°/sec.  The  RMS  errors  for  aimpoint  and  angular  rate  are  2.40°  and 
1.29  °/sec,  respectively. 

4.4. 1.2  Testing  Using  Superpixellated  Images 

Estimation  using  high  resoluation  images  (442x512)  leads  to  a  rather  large  impact  time 
errors.  Since  the  model  based  estimator  is  best  suited  for  simple  geometry  terrain  without  objects 
and  uses  only  low-frequency  terrain  motion,  superpixellation  of  the  original  images  is  called  for. 
Figures  4.4-8a  and  4.4-8b  show  the  superpixellated  versions  of  the  images  illustrated  in  4.4- la 
and  4.4-2b,  at  a  resolution  of  56  x  64  pixels,  using  8x8  kernels.  The  trucks  are  no  longer 
recognizable,  but  the  low  frequency  imagery  of  the  runway  is. 
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Figures  4.4-9  to  4.4-14  show  the  estimation  time  histories  for  the  helicopter  impact  time 
(altitude/speed),  heading  angle,  flight  path  angle,  and  three  individual  angular  velocity 
components,  respectively.  As  shown  in  figure  4.4-9,  the  impact  time  estimate  now  approaches 
the  true  impact  time.  The  difference  between  the  true  and  the  filtered  impact  time  is  about  0.05 
sec  or  1.6  ft.  This  shows  that  for  overall  terrain  surface  estimation  superpixellation  does  not 
adversely  affect  estimation  accuracy  but  enhances  it. 

Figures  4.4- 10a  and  4.4- 10b  illustrate  the  time  histories  of  the  heading  and  flight  path  angle 
estimates,  respectively.  Superpixellation  does  not  affect  heading  angle  estimation;  but  seems  to 
introduce  a  consistent  negative  flight  path  angle  bias  to  the  flight  path  estimate.  Note  again  how 
the  system  estimates  rapidly  follow  the  trend  of  true  helicopter  motion  states. 

Angular  velocity  estimation  is  not  adversely  affected  by  superpixellation,  as  shown  in  figures 
4.4-1  la,  b,  and  c.  The  three  estimated  angular  velocity  components  track  and  follow  the 
dynamics  of  the  true  angular  velocity  closely. 

Figures  4.4-12,  4.4-13,  and  4.4-14  show  the  time  histories  of  the  impact  time  error,  aimpoint 
error  and  angular  rate  error,  respectively.  After  the  first  30  frames  (1  sec),  the  impact  time  error 
almost  stays  within  a  10%  error  zone,  the  aimpoint  error  stays  below  3°,  and  the  angular  rate 
error  stays  below  27sec.  The  RMS  errors  for  the  impact  time,  aimpoint,  and  angular  rate  are 
7.60%,  3.52°,  and  1.93  °/sec,  respectively. 


Figure  4.4-9:  Impact  Time  Estimation  History 
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Figure  4.4-13:  Aimpoint  Error  History 


Figure  4.4-14:  Angular  Rate  Error  History 


4.4.2  Structure-Based  Estimator 

For  the  structure-based  estimator,  a  256  layer  structure  was  assumed.  The  relative  distance 
between  layers  was  set  at  0.01.  The  Kalman  filter  system  model  standard  deviations  and  the 
discount  factor  were  the  same  as  those  used  in  the  geometry-based  estimator.  Figure  4.4-15  to 
4.4-20  show  time  histories  for  the  estimated  helicopter  impact  time,  heading  angle,  flight  path 
angle,  and  three  individual  angular  velocity  components. 

We  tested  the  estimator  using  both  the  original  images  and  8x8  kernel  superpixllated 
images.  Section  4.4.2. 1  describes  the  results  using  the  original  images,  while  section  4.4. 2. 2 
describes  the  results  using  the  superpixellated  images. 
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4.4.2. 1  Testing  Using  the  Original  Images 

Figure  4.4-15  shows  a  history  of  the  estimated  impact  time.  As  shown,  the  impact  time 
estimate  has  quite  a  large  error.  The  same  two  factors,  "non-decorated"  real  imagery  and  small 
pixel  size,  which  were  described  in  the  geometry-based  estimator,  contribute  to  this  large  error. 
Note  that  the  impact  time  error  is  smaller  than  that  obtained  using  the  geometry  based  estimator. 
Moreover,  the  structure  ba.sed  estimator  was  able  to  extract  more  than  the  terrain  structure. 
Figures  4.4- 16a,  4.4- 16b,  and  4.4- 16c  show  the  fifth,  fifteenth,  and  twenty-fifth  impact  time 
maps  obtained  directly  from  the  estimator.  In  the  figures,  blobs  indicating  the  location  of  trucks 
are  evident  even  without  any  post  image  processing. 


Figure  4.4-15:  Impact  Time  Estimation  History 
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Figure  4.4-16a:  The  Fifth  Impact  Time  Map 
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trucks 


Figure  4.4-16b:  The  Fifteenth  Impact  Time  Map 


trucks 


Figure  4.4-16c:  The  Twenty  Fifth  Impact  Time  Map 


Figures  4.4- 17a  and  4.4- 17b  illustrate  the  time  histories  of  the  estimated  heading  and  flight 
path  angles,  respectively.  Figures  4.18a,  4.18b,  and  4.18c  show  time  histories  of  the  three 
estimated  angular  velocity  components,  respectively.  Note  how  the  estimated  motion  states 
oscillate  around  the  measured  (true)  motion  states.  The  ability  of  the  structure-based  estimator  to 
track  the  helicopter  motion  state  dynamics  is  clearly  demonstrated  in  these  plots. 
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Frame  Count 

Figure  4.4-18c:  Yaw  Angular  Velocity  Estimation  History 

Figure  4.4- 19a  and  4.4- 19b  show  time  histories  of  the  aimpoint  error  and  angular  rate  error, 
respectively.  The  aimpoint  error  stays  between  T  to  2.5°,  while  the  angular  rate  error  stays  below 
3.57sec.  • 


Frame  Count 


Figure  4.4-19a:  Aimpoint  Error  History 
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Figure  4.4-19b:  Angular  Rate  Error  History 


RMS  errors  for  the  filtered  impact  time,  aimpoint,  and  angular  rate  errors  are  61.76%,  4.62J, 
and  2.15j/sec,  respectively. 

4. 4. 2. 2  Testing  Using  Superpixellated  Images 

Figures  4.4-20  to  4.4-23  show  estimation  time  histories  for  the  helicopter  impact  time 
(altitude/speed),  heading  angle,  flight  path  angle,  and  three  individual  angular  velocity 
components,  respectively  using  8x8  kernel  superpixellated  images.  As  shown  in  figure  4.4-20, 
the  impact  time  estimate  now  approaches  the  true  impact  time.  The  difference  between  the  true 
and  the  filtered  impact  time  is  approximately  0.05  sec  or  1.6  ft  most  of  the  time.  As  with  the 
geometry-based  estimator  case,  superpixillation  significantly  improves  the  accuracy  of  the 
estimated  impact  time. 
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Figure  4.4-20:  Impact  Time  Estimation  History 


Figures  4.4-2 la  and  4.4-2 lb  illustrate  time  histories  of  the  heading  and  flight  path  angle 
estimates,  respectively.  Superpixellation  seems  not  affect  the  overall  accuracy  of  the  heading 
estimates.  Note  again  the  ability  of  the  estimator  in  following  the  trend  of  true  helicopter 
heading. 


Figure  4.4-21a:  Heading  Angle  Estimation  History 
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Figure  4.4-21b:  Flight  Path  Angle  Estimation  History 


Figures  4.4-22a,  4.4-22b,  and  4.4-22c  show  histories  of  the  three  estimated  angular  velocity 
components.  Superpixellation  seems  to  reduce  the  ability  of  the  estimator  in  closely  following 
the  trend  of  the  measured  angular  velocity. 


Figure  4.4-22a:  Roll  Angular  Velocity  Estimation  History 
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Figure  4.4-22b:  Pitch  Angular  Velocity  Estimation  History 


Figure  4.4-23c:  Yaw  Angular  Velocity  Estimation  History 

Figures  4.4-24,  4.4-25,  and  4.4-26  show  time  histories  of  the  impact  time  error,  aimpoint 
error  and  angular  rate  error,  respectively.  The  impact  time  error  stays,  most  of  the  time,  within  a 
10%  error  zone,  the  aimpoint  error  stays  almost  below  4.0°,  and  the  angular  rate  error  stays 
below  3.0°/sec. 
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Figure  4.4-26:  Angular  Rate  Error  History 

RMS  errors  for  the  filtered  impact  time,  aimpoint,  and  angular  rate  errors  are  10.02%,  3.93J, 
and  2.07j/sec,  respectively. 

4.5  Summary 

We  systematically  tested  and  evaluated  ATAS  using  the  CGI  and  flight-recorded  real-time 
frame  grabbed  terrain  imagery.  The  evaluation  results  indicate  that  ATAS  has  a  capability  for 
accurate,  efficient,  and  robust  egomotion  and  terrain  shape  estimation  under  various  terrain  (flat 
and  non-flat)  and  motion  (constant  and  non-constant)  conditions,  when  proper  estimators 
(geometry-based  or  structure -based  estimators)  and  proper  pixellation  sizes  are  selected.  This 
capability  is  achieved  through  the  development  of  a  Kalman  filter  based  approach  to  egomotion 
and  terrain  shape  estimation.  The  approach  employs  a  direct  computation  of  the  image  gradient 
measurements  (and  the  flow-field  computations),  a  snapshot  estimate  of  instantaneous  egomotion 
and  terrain  shape  using  only  the  current  image  measurements,  and  a  Kalman  filtering  process  to 
integrate  estimates  over  multiple  image  frames  to  reduce  errors. 

Using  CGI,  we  evaluated  ATAS  for  both  flat  and  non-flat  terrain  images.  With  the  flat 
terrain,  ATAS  had  estimation  errors  of  less  than  2  degree  in  aimpoint,  less  than  3  degree/second 
in  angular  rate,  and  less  than  5%  in  for  altitude  (impact  time). 

With  the  non-flat  terrain  images,  we  used  a  complex  sequence  that  was  generated  by 
simulating  a  flight  over  the  Yosemite  valley.  ATAS  demonstrated  an  excellent  ability  to  recover 
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the  complicated  Yosemite  valley  stracture,  whose  complexity  is  far  beyond  the  capability  of  any 
polynomial  geometry  terrain  models  (flat,  quadratic,  etc.).  Moreover.  ATAS  had  a  less  than  5 
degree  error  in  aimpoint,  and  a  less  than  1  degree/second  error  in  angular  rate,  for  these 
complicated  non-flat  terrain  images. 

For  the  flight-recorded  and  real-time  grabbed  terrain  images,  even  though  they  were  poorly 
“decorated”,  had  very  high  noise-to-signal  ratios,  and  the  observer  (airplane)  was  not  undergoing 
constant  motion,  ATAS  was  still  capable  of  generating  accurate  egomotion  and  terrain  shape 
estimates.  Specifically,  it  demonstrated  errors  of  less  than  5  degree  in  aimpoint,  less  than  2 
degree/second  in  angular  rate,  and  less  than  10%  in  altitude  (impact  time).  Moreover,  ATAS 
showed  an  excellent  ability  in  rapidly  following  and  tracking  the  non-constant  observer  motion. 
Besides  generating  overall  terrain  shape  estimates  such  as  speed-scaled  altitude,  ATAS  was  also 
capable  of  recovering  the  shape  of  local  objects  (in  this  case,  trucks  present  in  the  terrain 
images). 

We  also  compared  the  advantages  and  disadvantages  of  the  geometry-based  and  structure- 
based  estimators,  summarized  in  table  1.3-1. 

The  geometry -based  estimator  can  only  be  applied  to  a  terrain  that  can  be  approximated  by  a 
simple  geometry  model.  Moreover,  using  a  geometry-based  model  to  approximate  the  entire 
terrain  within  the  FOV  makes  it  impossible  to  recover  local  structures  hidden  in  the  terrain 
image,  such  as  isolated  objects.  In  contrast,  the  structure -based  approach  has  no  limitation  in  its 
applicability.  It  can  be  applied  to  both  simple  and  complicated  terrain.  Moreover,  it  can  be  used 
to  recover  the  local  structure  of  isolated  objects  in  the  imager  FOV. 
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5  Summary,  Conclusions,  and  recor.  .endations 

This  chapter  concludes  the  report  with  a  summary,  conclusions,  and  recommendations  for 
further  development  and  demonstration. 

5.1  Summary 

Our  Phase  11  technical  approach  to  developing  and  demonstrating  a  prototype  version  of  the 
passive  sensor  altitude  and  terrain  awareness  system  (ATAS)  was  comprised  of  four  tasks: 

•  Review  of  Current  Approaches  to  Flow-Based  Egomotion  and  Terrain  Shape  Estimation 

•  Enhancement  and  Development  of  a  Prototype  ATAS 

•  Development  of  a  Test  Bed  for  ATAS 

•  Demonstration  and  Evaluation  of  System  Performance  with  CGI  and  flight-recorded 
imagery 

•  Recommendation  of  development  path  for  a  commercial  product 

We  first  reviewed  current  approaches  and  evaluated  their  advantages  and 
disadvantages.  A  literature  search  was  conducted  specifically  focusing  on  flow-field  based 
estimation  technologies.  A  review  and  summary  of  the  most  promising  studies  were  conducted  to 
identify  capabilities  and  limitations  of  earlier  flow-field  based  approaches. 

We  then  designed  and  implemented  a  prototype  ground  based  ATAS  via  specifying  its 
overall  architecture,  its  functional  modules,  and  its  software  implementation.  Our  efforts  were 
focused  on  the  development  of  a  robust,  accurate,  and  efficient  ATAS  that  can  work  in  real-time. 
A  Kalman  filter  based  direct  approach  was  developed  that  comprised  of  three  modules:  1)  an 
image  processor;  2)  a  snapshot  egomotion  and  terrain  shape  estimator;  and  3)  a  Kalman  filter 
providing  smoothed  estimates. 

We  also  developed  and  implemented  a  test  bed  that  has  a  capability  for  flat  and  non-flat 
computer  terrain  image  generation.  We  defined  and  implemented  various  performance  metrics 
for  system  evaluation. 

Eollowing  implementation  of  the  ground  based  ATAS  and  its  test  bed,  we  demonstrated 
and  evaluated  system  performance  via  an  extensive  testing  program.  ATAS  was  tested  using 
both  CGI  and  real  imagery,  and  using  both  the  flat  and  non-flat  terrain  imagery.  It  was  also  tested 
against  human  experiments. 

Finally,  we  recommended  a  development  path  for  a  commercial  product,  based  on  the 


90 


R92191 


Charles  River  Analytics 


results  of  the  evaluation  effort.  The  development  path  includes  an  enhanced  performance  version 
of  the  Phase  II  design,  and  demonstration  and  performance  evaluation  over  an  extended  CGI  and 
real  imagery  database. 

5.2  Conclusions 

The  primary  result  of  this  Phase  II  effort  was  the  successful  enhancement,  implementation, 
and  demonstration  of  the  AT  AS  operation  using  both  computer  generated  and  real  imagery.  The 
major  findings  supporting  this  effort  can  be  summarized  as  follows. 

We  developed  an  overall  architecture  of  ATAS  that  comprises  both  the  real-time  image 
generation  and  real-time  egomotion  and  terrain  shape  estimation.  We  developed  and 
implemented  a  flat-terrain  image  generator  that  produces  CGI  frames  simulating  perspective 
images  viewed  by  an  observer  (camera)  flying  over  flat  terrain  decorated  by  arbitrary  sinusoidal 
patterns.  We  also  developed  and  implemented  a  rolling-terrain  image  generator  that  produces 
video  (analog)  imagery  simulating  images  viewed  by  an  observer  flying  over  a  sinusoidal  rolling 
terrain  decorated  by  sinusoidal  patterns.  In  addition,  from  various  sources  we  obtained  other 
complex  CGI  terrain  images  used  widely  for  computer  vision  algorithm  testing.  For  real  image 
generation,  we  obtained  video  imagery  using  a  video  camera  fixed  on  a  helicopter  flying  over 
real  terrain.  To  convert  the  analog  video  imagery  into  digitized  image  frames,  we  implemented  a 
frame  grabber  using  hardware  and  software  image  grabbing  tools  hosted  by  a  Macintosh  Quadra 
840  AV  computer. 

For  egomotion  and  terrain  shape  estimation,  we  developed  a  three  stage  modular  architecture 
comprised  of  an  image  processor;  a  snapshot  egomotion  and  terrain  shape  estimator;  and  a 
Kalman  filter,  to  provide  smoother  high-accuracy  on-line  estimates.  This  modular  approach 
provided  performance  improvements  and  simplification  for  system  implementation  over  the 
original  design  developed  under  the  Phase  I  effort.  First,  it  avoids  the  complicated  interactions 
between  low-level  image  processing  and  high-level  motion  state  and  terrain  shape  estimation. 
Second,  separation  of  image  processing,  snapshot  estimation,  and  Kalman  filtering  makes  each 
problem  simpler  and  its  solution  easier  and  more  robust.  Third,  the  system  has  a  natural  interface 
to  incorporate  external  information  for  performance  enhancement.  For  example,  if  the  external 
information  on  observer  motion  state  is  available,  it  can  then  be  easily  incorporated  into  the 
Kalman  filter  without  reconfiguration  of  the  entire  system.  Finally,  we  believe  that  this  modular 
architecture  can  provide  the  basis  for  a  functional  stands  model  of  the  human  flow-field  based 
vision  process. 

We  developed  two  types  of  egomotion  and  terrain  shape  estimators  to  deal  with  different 
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kinds  of  terrain  conditions:  a  geometry-based  estimator  and  a  structure-based  estimator.  The 
geometry-based  estimator  assumes  that  the  terrain  shape  satisfies  a  polynomial  geometry,  and 
uses  this  model  for  terrain  shape  estimation.  The  structure  based  estimator  does  not  assume  a 
priori  a  terrain  model,  but  extracts  the  hidden  structural  relationship  among  imaged  terrain  points 
using  a  multi-layer  terrain  structure  model. 

For  static  estimation  in  the  snapshot  estimator  and  dynamic  estimation  in  the  Kalman  filter, 
we  employed  and  implemented  two  state-of-the-art  estimation  algorithms:  the  Singular  Value 
Decomposition  (SVD)  based  Least  Squares  (LS),  algorithm  and  the  Square-Root  Kalman  filter 
algorithms.  The  application  of  these  two  numerically  robust  algorithms  effectively  overcomes 
any  numerical  instabilities  that  might  be  present  in  the  estimation  process. 

Furthermore,  the  operation  of  ATAS  needs  specification  of  only  a  few  meaningful 
parameters.  For  the  geometry  based  approach,  only  four  parameters  need  to  be  specified:  the 
terrain  model,  the  initial  heading  vector,  the  model  standard  deviations  for  motion  state  dynamic 
equations,  and  the  discount  factor  for  terrain  impact  time  (range)  map  evolution:  for  the  structure 
based  approach,  only  the  latter  two  parameters  need  to  be  specified.  In  contrast,  many  more 
parameters  would  be  needed  in  a  flow  or  feature  matching  based  .system,  many  of  which  need  to 
be  determined  by  trial  and  error. 

We  conducted  a  successful  on-line  demonstration  of  the  ATAS  ground-based  prototype, 
driving  the  system  with  CGI  and  flight-recorded  imagery,  generating  running  estimates  of 
altitude  and  attitude  with  respect  to  the  overflown  terrain.  The  evaluation  results  indicate  that 
ATAS  has  a  capability  for  accurate,  efficient,  and  robust  egomotion  and  terrain  shape  estimation 
over  various  terrain  (flat  and  non-flat)  and  under  different  motion  conditions,  when  the  proper 
estimators  (the  geometry-based  estimator  or  the  structure-based  estimator)  and  the  proper 
pixellation  sizes  are  selected. 

Using  CGI,  we  evaluated  ATAS  for  both  flat  and  non-flat  terrain  images.  With  the  flat 
terrain,  ATAS  had  estimation  errors  of  less  than  2  degree  in  aimpoint.  less  than  3  degree/second 
in  angular  rate,  and  less  than  5%  in  for  altitude  (impact  time). 

With  the  non-flat  terrain  images,  we  used  a  complex  sequence  that  was  generated  by 
simulating  a  flight  over  the  Yosemite  valley.  ATAS  demonstrated  an  excellent  ability  to  recover 
the  complicated  Yosemite  valley  structure,  who.se  complexity  is  far  beyond  the  capability  of  any 
polynomial  geometry  terrain  models  (flat,  quadratic,  etc.).  Moreover,  ATAS  had  a  less  than  5 
degree  error  in  aimpoint.  and  a  less  than  1  degree/second  error  in  angular  rate,  for  these 
complicated  non-flat  terrain  images. 
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For  the  flight-recorded  and  real-time  grabbed  terrain  images,  even  though  they  were  poorly 
“decorated”,  had  very  high  noise-to-signal  ratios,  and  the  observer  (airplane)  was  not  undergoing 
constant  motion,  ATAS  was  still  capable  of  generating  accurate  egomotion  and  terrain  shape 
estimate.  Specifically,  it  demonstrated  errors  less  than  5  degree  in  aimpoint,  less  than  2 
degree/second  in  angular  rate,  and  less  than  10%  in  altitude  (impact  time).  Moreover,  ATAS 
showed  an  excellent  ability  in  rapidly  following  and  tracking  the  non-constant  observer  motion. 
Besides  generating  overall  terrain  shape  estimates  such  as  speed-scaled  altitude,  ATAS  was  also 
capable  of  recovering  the  shape  of  local  objects  (in  this  case,  trucks  present  in  the  terrain 
images). 

We  also  compared  the  advantages  and  disadvantages  of  the  geometry  based  and  structure- 
based  estimators,  summarized  in  table  1.3-1. 

The  geometry-based  estimator  can  only  be  applied  to  a  terrain  that  can  be  approximated  by  a 
simple  geometry  model.  Moreover,  using  a  geometry-based  model  to  approximate  the  entire 
terrain  within  the  FOV  makes  it  impossible  to  recover  local  structures  hidden  in  the  terrain 
image,  such  as  isolated  objects.  In  contrast,  the  structure-based  approach  has  no  limitation  in  its 
applicability.  It  can  be  applied  to  both  simple  and  complicated  terrain.  Moreover,  it  can  be  used 
to  recover  the  local  structure  of  isolated  objects  in  the  imager  FOV. 

Large  pixels  can  be  used  for  image  measurement  computation  in  the  geometry-based 
estimator.  Large  pixels  not  only  reduce  the  computation  load  greatly,  but  also  enhance  overall 
estimation  accuracy.  The  structure-based  estimator,  however,  requires  small  pixels  to  properly 
define  the  terrain  structure  and  thus  incurs  additional  computational  costs. 

The  geometry-based  estimator  has  superior  performance  over  the  structure-based  estimator 
when  the  actual  terrain  can  be  approximated  by  a  simple  geometry.  In  this  case,  the  geometry- 
based  model  represents  a  true  description  of  the  terrain  shape  being  imaged.  The  structure-based 
estimator,  however,  always  has  an  approximate  representation  of  the  actual  terrain,  no  matter 
how  simple  or  complicated  it  actually  is. 

Using  ATAS,  we  also  analyzed  several  previous  experimental  studies  of  human  egomotion 
perception.  For  the  two  reported  on  here,  we  used  simple  additive  image  noise  to  model  the 
human's  inability  to  register  perfectly  noise-free  images,  and  we  were  able  to  show  how  this 
perceptual  noise  leads  to  corresponding  inaccuracies  in  estimates  of  egomotion  state.  In 
particular,  we  showed  how,  within  the  model  context,  an  assumed  10%  to  15%  image  noise  level 
leads  directly  to  model-predicted  egomotion  aimpoint  accuracy  in  the  experimentally  observed 
range  of  1.5  to  2.5  degree.  In  addition,  we  showed  how  these  aimpoint  accuracy  can  be  expected 
to  CO- vary  with  velocity-to-height  ratios,  which  change  by  over  a  factor  of  10,  and  confirmed 
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these  trends  with  experimental  data  showing  the  same  trends.  Although  more  model  validation  is 
clearly  needed,  we  feel  that  these  initial  matches  between  model  and  data  are  very  encouraging. 

Finally,  we  identified  requirements  for  production  system  development  and  demonstration. 
Specifically,  we  recommended  that  the  Phase  II  laboratory  prototype  serve  as  the  basis  for  the 
development  of  a  flight  prototype  system,  to  be  used  to  demonstrate  in-flight  capabilities  and 
system  potential.  A  two-step  development  and  demonstration  program  is  envisioned: 
1)  development  and  demonstration  of  a  flight  prototype  system  configured  for  in-flight  test  and 
evaluation;  and  2)  specification  of  design  requirements  for  a  follow-on  production  system. 

The  Phase  II  findings  demonstrated  the  overall  capability  of  ATAS  and  established 
benchmark  criteria  for  evaluating  future  systems.  The  effort  demonstrated  operation  with  off-the- 
shelf  hardware,  at  performance  levels  that  can  significantly  enhance  critical  on-board  functions, 
including  passive  navigation,  optically-based  TF/TA,  and  FLIR-based  target  designation  and 
weapons  control. 

5.3  Recommendations 

On  the  basis  of  these  Phase  II  prototype  evaluations,  we  propose  to  develop  a  Phase  III  real¬ 
time  flight  system  to  meet  specified  performance,  component  compatibility,  and  budget 
constraints.  A  two  step  effort  is  proposed. 

•  The  first  step  would  be  to  transform  ATAS  from  the  current  development  host  to  the  real¬ 
time  host,  to  test  the  system  operation  on  a  terrain-board  simulator,  and  to  make  design 
modification  if  deemed  necessary. 

•  The  second  step  would  be  to  convert  the  ground  based  real-time  host  to  a  flight  rated  real¬ 
time  host,  and  to  conduct  in-flight  evaluations. 

The  first  step  would  focus  on  the  selection  and  configuration  of  the  real-time  host  computer. 
The  key  system  parameter  in  selecting  the  real-time  host  is  its  floating  point  processing  power. 
Two  types  of  real-time  processing  are  required.  The  first  type  is  real-time  frame  grabbing  if  a 
non-digital  camera  system  were  to  be  used  for  terrain  image  generation.  The  second  type  is  the 
real-time  image  processing  for  the  egomotion  and  terrain  shape  estimates.  The.se  have  a  total 
computational  complexity  of  the  order  KN  flops,  where  N  is  the  total  number  of  pixels  and  K  is  a 
coefficient  accounting  for  the  per  pixel  operations  in  image  measurement  computation,  snapshot 
estimation,  and  Kalman  filtering.  As  discussed  earlier,  the  current  development  implementation 
has  an  associated  K  value  of  100. 

Table  5.3-1  illustrates  the  computational  requirements  for  image  processing  of  different  sizes 
of  image  frames,  assuming  K=100,  and  assuming  a  30  Hz  frame  rate. 


94 


R9219I 


Charles  River  Analytics 


Table  5.3-1:  ATAS  Computational  Requirements 


Number  of  Pixels 

Total  Flops 

512  X  480  (standard  NTSC) 

24  Million 

128  X  120  (4  kernel  superpixellation) 

1 .5  Million 

64  X  60  (8  kernel  superpixellation) 

0.4  Million 

Based  on  these  computational  requirement  analysis,  we  determine  that  an  ott-the-shelt  Intel 
586-based  120  MHz  host  computer  can  achieve  the  desired  real-time  requirements. 


Figure  5.3-1  shows  one  potential  architecture  for  a  586  PC-hosted  ATAS  real-time 
production  system  using  an  industry  standard  ISA  bus  to  communicate  with  separate  boards  loi 
frame  grabbing  and  image  processing.  One  particular  advantage  ot  the  PC-based  system  is  the 
large  vendor  base  for  all  three  types  ol  boards.  For  the  Irame  grabber  and  image  co-piocessoi 
boards,  there  are  perhaps  more  than  10  vendors.  We  reviewed  a  number  ol  available  boards  and 
found  that  board  sets  such  as  those  provided  by  Imaging  Technologies  are  particularly  well- 
suited  for  the  proposed  application.  For  example,  the  Imaging  Technologies  VISIONplus-AT 
Advanced  Frame  Grabber  Board  operates  in  real-time  with  standard  30  Hz  NTSC  images,  and 
captures  them,  digitizes  them,  and  makes  them  available  for  further  processing  by  downstream 
boards  or  the  host.  For  follow-on  co-processing  of  the  grabbed  image,  including  the 
SLiperpixellation  function,  an  image  co-processing  accelerator  from  Imaging  Technologies  can 
also  be  used.  A  candidate  board  is  their  33  Mflop  VISIONplus-AT  Image  Processing  Acceleratoi 
(VIPA),  which  is  connected  to  the  front-end  frame  grabber  via  the  high  speed  VISlONbus,  to 
support  real-time  30  Hz  processing  of  512  x  480  images,  including  image  arithmetic,  geometric 
transforms,  and  spatial  filtering  (convolution). 


Figure  5.3-1:  Architecture  for  Real-Time  ATAS 


With  the  image  grabbed  in  real-time  by  the  frame  grabber  board,  the  120  MHz  586  PC  host 
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would  have  enough  power  to  conduct  the  egomotion  and  terrain  shape  .‘.stimation  in  real-time  for 
superpixellated  images.  For  512  x  480  images,  an  additional  array  processor  might  be  neede  '  for 
the  real-time  processing,  as  shown  in  5.3-1. 


Table  5.3-2  shows  the  estimated  costs  for  the  proposed  real-time  system.  Note  that  the  array 
processor  board  may  not  be  necessary.  We  have  thus  indicated  it  as  optional. 

Table  5.3-2:  Estimated  Costs  for  the  Proposed  Real-Time  ATAS 


BSSlIffiidisIliiB 

....  -  . 

faTM.€QST<S) 

120  MHz  586-based  PC  with 

32  Mb  RAM 

1  GbHD 

Quad  Speed  CD-ROM 

3,100 

Frame  Grabber  Board 

IT  Vision  Plus  AT  and  software 

3,000 

Image  CO  processor  Board 

IT  Vision  Plus  AT  and  software 

7,000 

13,100 

Array  Processor  Board  (optional) 

11,000 

24,  100 

The  second  step  would  focus  on  the  conversion  of  the  ground-based  real  time  ATAS  to  a 
flight  rated  real  time  system.  Four  subtasks  are  involved:  1)  a  specification  and  configuration  of 
the  586  PC  host  to  meet  MIL-SPEC  requirements;  2)  an  integration  of  the  host  with  the  on-board 
imaging  sensor;  3)  porting  and  checkout  of  the  flight  system;  and  4)  flight  testing  to  determine 
the  system  performance. 

For  the  Phase  III  flight  system  (FS)  of  ATAS,  we  propose  a  three-bus  architecture,  as 
illustrated  in  figure  5.3-2.  The  DoD  MIL-STD-1553B  Data  Bus  for  avionics  is  shown  as  the  top¬ 
most  bus,  interlinking  the  ATAS  real-time  production  system  (shown  in  the  dashed  box)  with  the 
other  on-board  systems.  The  Industrial  Standard  Architecture  (ISA)  bus  is  shown  as  the  middle 
bus,  and  serves  to  interconnect  several  required  boards  on  a  PC  host.  The  VISIONBUS,  a 
proprietary  bus  by  Imaging  Technologies,  Inc.,  is  shown  as  the  bottom-most  bus,  and  serves  to 
interconnect  image  processing  boards  requiring  very  high  data  bandwidths. 
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Figure  5.3-2:  Proposed  Architecture  for  ATAS  Flight  Prototype  System  (FPS) 


The  1553B  data  bus  interlinks  the  ATAS  processor  with  the  several  remote  terminals 
serving  other  on-board  systems.  The  purpose  of  this  bus  is  to  provide  access  to  on-board 
measurements  and  system  parameters  to  validate  ATAS  operation  and  estimation  accuracy. 
Specifically  shown  is  the  air  data  computer  (ADC)  which  provides  barometric  altitude,  the 
inertial  navigation  system  (INS)  which  provides  vehicle  altitude,  location,  and  ground  speed,  the 
RADAR  altimeter  subsystem  which  provides  actual  altitude  for  verification  purposes,  and  an 
ATAS  display,  for  optional  display  of  ATAS-generated  terrain  profiles.  Although  not  shown  on 
the  diagram,  it  is  anticipated  that  a  terrain  following  radar  (TFR)  system,  if  available,  would  also 
be  accessed,  to  allow  for  the  direct  comparison  of  TFR-generated  terrain  shape  estimates  with 
those  generated  by  the  ATAS  processor. 

The  1553B  data  bus  itself  is  defined  formally  by  MIL-STD-1553B  (1978),  which  establishes 
the  requirements  for  digital  data  buses  on  military  aircraft.  The  standard  specifies: 
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1.  Types  of  terminals:  bus  controller,  bus  monitor,  and  remote  terminal 

2.  Bus  protocol,  including  message  formats  and  word  structure 

3.  Hardware  performance  specifications,  such  as  characteristic  impedance  and  connection 
requirements. 

The  proposed  FS  will  interface  with  the  data  bus  via  a  Remote  Terminal,  which  will  respond 
only  to  commands  specifically  addressed  to  it.  We  plan  to  embed  the  remote  terminal  within  the 
processor  subsystem,  for  maximum  interfacing  autonomy.  The  proposed  system  will  use  the  bus 
protocol  detailed  by  the  1553  message  formats,  or  information  transfer  formats.  The  ISA  bus 
shown  in  figure  5.3-2  serves  as  the  backbone  of  the  ATAS  FPS,  outlined  by  th®  dashed  box  in 
the  figure.  Currently  under  development  by  the  Crew  Systems  Integration  Branch  of  the 
Armstrong  Laboratory  (AL/CFHI),  as  part  of  the  Flight  Computer  Project  (FCP),  is  an  ISA-based 
computer  which  will  directly  interface  to  the  1553B  bus  as  a  remote  terminal  (Turner,  1992).  We 
propose  its  use  during  the  Phase  11  effort,  as  the  host  processor  for  the  ATAS  FPS. 

The  FCP  computer  consists  of  an  Intel  80586  processor  operating  at  120  MHz,  32  Mb  RAM, 
one  Gig  hard  drive,  a  Quad  Speed  CD-ROM  drive,  and  serial  and  parallel  ports.  The  interface 
between  the  1553  and  ISA  buses  is  a  1553  Bus  Twin- Ax  Connector,  a  BN73  Tee  Connector,  and 
a  BJ77  Backpanel  Connector.  Available  within  the  system  box  are  three  (16-bit)  ISA  slots,  which 
we  propose  to  use  for  the  image  processing  cards,  as  shown  in  the  figure. 

The  FCP  computer  holds  another  advantage  besides  the  hybrid  ISA/1553  bus  structure:  fully 
integrated  into  the  computer  is  a  Global  Positioning  System  (GPS),  which  can  provide 
worldwide  navigation  coverage  to  an  accuracy  of  a  few  meters.  The  satellite-based  system 
operates  via  transmission  of  precisely  timed  signals  to  a  ground-based  GPS  receiver.  Using 
known  satellite  positions,  and  triangulation  over  multiple  satellites,  the  GPS  receiver  computes 
the  unknown  signal  transmit  delay  and  the  receiver's  unknown  location.  This  is  then  made 
available  to  the  host  system  for  further  processing.  Six  orbital  planes  with  three  satellites  per 
plane  provide  round-the-clock  coverage.  GPS  position  could  be  used  to  generate  local  terrain 
elevation  during  flight  tests,  if  the  overflights  are  made  over  surveyed  regions,  and  if  the  survey 
is  available  in  digital  format.  Combining  a  map-based  elevation  profile  with  the  instantaneous 
measurement  available  from  a  radar  altimeter  would  then  allow  us  to  accurately  compute  actual 
elevation  above  the  terrain,  for  later  off-line  verification  of  system  operation. 

Two  options  for  the  display  hardware  are  possible.  First,  the  information  to  be  displayed 
could  be  sent  via  the  1553B  data  bus  to  an  ATAS  display  connected  to  the  bus.  This  will  be  an 
attractive  option  if  the  flight  prototype  is  equipped  with  a  general  purpose  multi-function  CRT 
display.  The  second  option  is  to  display  the  information  on  a  dedicated  ISA-based  video  display 
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which  is  part  of  the  FCP  computer.  The  main  advantage  of  this  option  is  that  there  is  no  need  to 
interface  with  existing  flight  hardware,  and  all  display  development  could  be  accomplished  on 
the  ground-based  demonstrator,  since  the  same  ISA-based  display  hardware  would  be  available. 

We  propose  to  evaluate  performance  of  the  Phase  III  FS  using  recorded  synthetic  and  real- 
image  sequences,  and  via  in-flight  testing. 

The  recorded  image  sequences  will  include  all  those  used  in  the  GBD  evaluation  effort, 
including:  Phase  II  image  sequences,  CGI  sequences  recorded  for  flat  and  rolling  terrain,  high- 
fidelity  visual  flight  simulator  image  sequences,  and  flight  recorded  sequences  available  through 
the  NASA  Ames  Research  centers.  Direct  comparison  with  the  previous  GBD  test  results  will 
serve  to  validate  operation  of  the  FS,  prior  to  flight  testing.  Demonstration  of  in-line  real-time  FS 
operation  will  then  proceed,  with  the  flight  demonstration  portion  of  the  effort. 

A  variety  of  aircraft  are  under  consideration  for  the  Phase  III  in-flight  demonstration.  These 
include  the  DC-3  LLTV/FLIR-equipped  imager  testbed  maintained  by  Texas  Instruments,  Inc., 
an  F-16  operated  by  General  Dynamics  providing  equipped  F-16  operated  by  the  6516th  Test 
Squadron  at  Edwards  AFB,  several  potential  platforms  operated  by  the  4950th  Test  Wing,  and 
the  NASA  Shuttle  Telescope  testbed.  In  addition,  a  variety  of  private  non-government  platforms 
could  be  readily  configured  for  the  ATAS  flight  demonstration;  several  candidates  are  currently 
under  evaluation. 

The  flight  demonstration  will  involve  an  initial  effort  in  installation  and  checkout  in  the  host 
aircraft,  in-flight  demonstration  of  a  range  of  maneuvers  and  terrain  conditions  chosen  to  test  the 
system  to  its  limits,  and  post-flight  analysis  of  system  performance.  We  recognize  that 
demonstration  of  in-flight  FPS  operation  will  present  a  significant  challenge,  primarily  because 
the  general  difficulty  in  conducting  flight  simulations  of  any  type.  However,  we  plan  to  conduct 
extensive  pre-flight  simulation  of  ATAS  operation,  both  under  the  FS  design  effort,  and  under 
the  GBD  effort.  This  will  be  aimed  at  defining  a  minimal  flight  test  sequence  which  will  best 
demonstrate  ATAS  capabilities  and  limitations  in  flight. 
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Appendix  A:  Image  Gradient  Error  Analysis 

We  first  consider  image  gradient  errors  induced  by  spatial  and  temporal  quanc’  /ation.  From 
the  two-point  ceneter  formulas  (3.1-la,b,  c),  we  can  see  that  spatial  quantization  wi  ;d  affect  the 
spatial  gradient  error  through  the  spatial  steps  Axi  and  Ays,  and  the  temporal  quantization  would 
affect  the  temporal  gradient  error  through  the  temporal  step  At.  We  start  with  the  analysis  of 

spatial  gradient  error.  Let  us  assume  that  the  same  pixel  size  is  adopted  in  x  and  y  directions,  i.e., 
Axj  =  Ayj  =  Aj.  When  the  three  point  formula  is  used  to  estimate  the  gradient  of  a  continuous 

function  f(x),  the  estimation  error  is  given  by 

A2 

A[f(x)]=  nx  +  ^)  (A-1) 

where  A  is  the  differential  step  and  ^  lies  between  -A  and  A  (Burden  &  Faires,  1989).  Using  (A- 
1),  we  obtain  the  spatial  gradient  error 

T 

^q(gi)  =  -  y-v  t),  Iy3(Xi,  yj  -I-  C,  t)]  (A-2) 

where  -  C  —  ^i- 


We  continue  to  determine  the  temporal  gradient  error.  Let  Vj  =  (Xj,  yi)be  the  image  flow  at 
pixel  i.  In  the  observer's  frame  of  reference,  the  temporal  intensity  change  at  pixel  i  can  be 
regarded  as  a  result  of  the  image  intensity  profile  moving  with  the  velocity  -Vj.  In  other  words, 
the  image  intensity  function  can  also  be  represented  by 

I(Xi,  yj,  t)  =  I(Xi  -  Xjt,  yj  -yjt)  (A-3) 

The  temporal  gradient  error  induced  by  temporal  quantization  is  then  given  by 
^  (At)^ 

Aq(Iit)  =  -  -  Xjt,  y;  -  yjl) 

2  3 

(^t)  V  r-n/"  /  ■  •  ,-3-n-n  .  *  ^ 

=  L  Cgl^B-n  y„(Xj  -  Xji,  yj  -  yii)Xi  (A-4a) 

n  =  0 

where  C3  is  the  combinatorial  coefficient. 


I^3-„yn(Xi-x,i,  yi-yii)  = 


a^I(Xi  -  Xjt,  yj  -  yji) 

9x^""  3y" 


(A-4b) 


and  -At  <  i  <  At. 


A-l 
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The  actual  amount  of  error  in  either  spatial  or  temporal  gradient  depends  on  the  value  of  E , 
^ ,  or  i.  For  small  A,  and  At ,  we  assume  that  the  probability  of  either  £  (^ )  or  i  being  at  a  certain 
place  within  the  interval  [-A,  ,  Aj  ]  or  [-At,  At]  follows  a  uniform  probability  distribution.  The 
expected  spatial  and  temporal  gradient  errors  at  pixel  i  are  then  obtained  as: 

__  Aj  ,ff  .j. 

Aq(gi)  =  -  -^[1x3  (Xj,  yi),  Iy3  (Xj,  yj)] 


Ailit) 


^ '  3-n '  n 

Cslx^n  „  (Xj,  yj)x,  yj 


(A-5a,  b) 


Under  the  same  probabilistic  view  of  the  image  gradient  error,  we  can  also  estimate  the 
image  gradient  error  variances.  Specifically,  the  variance  for  the  x-direction  spatial  gradient  is 
estimated  by 

o2(L)  =  E[AJLx)-AJI,x)]2 


=  ^  E[I"3  (x,  + 1  Yi)  -  I'xkxi,  y,)]^ 


Af[Ix^  (X.,  Yi)]- 
108 


(A-6a) 


Similarly,  we  have  the  variances  for  the  y-direction  spatial  gradient  and  temporal  gradient  as 


Af[V(x„yi)]2 


(A-6b) 


(At)^[  .  (X,,  yi)x^"yf]^ 


(T^n  'v  a _ n  =  0 _ 

~  1  n 


(A-6c) 


We  now  consider  the  image  gradient  error  induced  by  intensity  quantization.  Let 
Ag[I(Xi,  Yj,  t)]  represent  the  intensity  quantization  error.  We  can  directly  compute  the  image 

gradient  error  from  the  three  point  formula: 

A*(Ii)  .  A^CIj)  .  A'di) 

Ag(Iix)  -  2Aj  ’  “  2Aj  ’  Ag(Iit)  -  2At 

where 


Ag(Ii)  =  Ag[I(Xi  +  Ax,  Yj,  t  -  At)]  -  Ag[I(Xj  -  Ax,  yj,  t  -  At)] 
A^dj)  =  Ag[I(Xj,  Yj  +  Ay,  t  -  At)]  -  Ag[I(Xi,  y  j  -  Ay,  t  -  At)] 
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A^(Ii)  =  Ag[I(Xi,  Yi,  t)]  -  Ag[I(Xi,  Yi,  t  -  2At)]  (A-7a,  b,  c,  d) 

Let  G  denote  the  greY  levels  used  to  represent  image  intensitY-  Assuming  that  the  intensitY 
quantization  error  follows  a  uniform  distribution  within  the  quantization  interval  [- 

2G  2G 

we  obtain  the  average  errors 

Vlix)  =  Vliy)  =  Ag(iit)  =  0  (A-8) 

and  the  variances: 


=  (^gCIiy)  = - ^  and  = 

^  ®  24G^A^  ® 


(A-9) 


using  the  computation  formulas  given  bY  Kamgar-Parsi  (1989). 
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Appendix  B:  Optimal  Assignment  Algorithm 

The  optimal  assignment  algorithm  is  derived  using  Sherman-Morris-  n  formula  (Press  et  al., 
1992)  to  compute  the  inverse  of  a  matrix  when  one  of  its  rows  or  column-  is  changed.  For  matrix 
A  and  some  column  vectors  u  and  v,  the  Sherman-Morrison  formula  is  given  as 


1  +  v*u 


(B-1) 


We  now  develop  an  iterative  algorithm  that  optimally  assigns  N  terrain  points  among  M 
structure  layers,  one  point  at  each  iteration.  Let  n  be  the  current  iteration,  aj*  be  the  point  i's 
current  assignment,  and  e({af})  be  the  current  projection  length.  At  iteration  n+1,  we  change  the 
point  i's  assignment  to  af  *  =  1,2, ...,  M.  Denote 

B"  s  |a"  =  hlCB")- yf  =  b"(B")“  ’h;  (B-2) 

Using  the  Sherman-Morrison  formula  to  compute  the  matrix  inverse 
[B"^  +((a["'^^^)^  -  (af)^)hjh|]“  ^ ,  we  have  the  projection  length  at  iteration  n-t-1 

e({af+' })  =  +  bit(af+i  -  af)h[][B"'^  +((a["+i))2  -  (af)2)hih[]-  -  afjhj  -H  b"] 

ppbf,(af  ^  1  -  af)^  +  2 TfbjtCaf  ^ »  -  af)  -  (7f)^[(a["^i))^  - (af)^] 

^  1  +  lif(a["+B)2  _  (an)2] 


=  e(fa'’l)  +  ^  -  aff  +  2ih^^  -  af-jf )yf(af  +  ^  -  af) 

^  pfCaf  +  *  -  ap)2  -I-  2|if apCaf  +  i  -  af )  -i- 1 


(B-3) 


We  want  to  select  an  assignment  aj''^*  that  maximizes  e({aj‘'^'}).  To  do  so,  we  find  the 
assignment  aj^  *  that  has  zero  gradient 

ae({ar  +  >}) 

aaH+i 

_  ^  [aj^lifbj  -  7i^]pS^bit(af  +  ^  ^  -  af)  +  [bj,  -  a|^]yj^ 

+  ‘  -  +  2(i|'aH(aH  +  >  -  aj^)  +  l]^  (3.4) 

Two  zero  gradient  solutions  can  be  found  by  solving  the  quadratic  function 

[al^lif'bi  -  "iflpfb itCal"  +  ^  -  af)^  -H  [^fb?  -  -  af)  -i-  [b jj  -  a^f lyi"  =  0  (3.5) 


Figure  B-1  illustrates  the  geometry  of  e({a|'+  *})  function  that  has  two  zero  gradient  points. 
From  the  figure  and  basic  function  optimization  theory,  we  can  see  that  the  optimal  assignment 
of  al'  *  is  to  assign  af  ^  to  one  of  the  zero  gradient  layer  that  increases  the  projection  length. 
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Appendix  C:  Motion  State  and  Terrain  Shape  Representation  in  a  Selected 

Coordinate  System 

The  egomotiori  and  terrain  shape  estimation  system  estimates  the  motion  state  and  terrain 
shape  relative  to  the  screen  coordinate  system.  The  estimated  motion  state  and  terrain  shape  can 
be  represented  in  any  observer-fixed  coordinate  system  via  a  coordinate  system  transformation, 
as  shown  in  figure  C-1. 

Q 


Figure  C-1  shows  the  z-axis  coordinates  '^'iz  ^  terrain  point  Pj  in  the  screen  and  a 

selected  observer-fixed  coordinate  system,  such  as  body  coordinate  system.  Let  Uz  =  (0,  0, 1)^  be 
the  z-axis  unit  vector  of  the  selected  coordinate  system,  and  Tsa  be  the  transformation  matrix 
between  the  screen  and  the  selected  coordinate  system.  The  relationship  between  xf2  and  Xj2  is 
given  by  the  projection  equation 

=  (C-1) 

Let  O®  and  ta®  be  the  angular  velocity  and  heading  vectors  in  the  screen  coordinate  system, 
respectively.  Their  representations  in  the  selected  coordinate  system  are  given  by 
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ISy  (C-3) 

Furthermore,  the  operation  of  ATA.S  needs  specification  of  only  a  few  meaningful 
parameters.  For  the  geometry-based  approach,  only  four  parameters  need  to  be  specified:  the 
terrain  model,  the  initial  heading  vector,  the  model  standard  deviastions  for  motion  state  dynamic 
equations,  and  the  discount  factor  for  terrain  impact  time  (range)  map  evolution;  for  the 
structure-based  approach,  only  the  latter  two  parameters  need  to  be  specified.  In  contrast,  many 
more  parameters  would  be  needed  in  a  flow  or  feature  matching  based  system,  many  of  which 
need  to  be  determined  by  trial  an,d  error. 
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